Preface


The ideas and concepts of physics are best expressed in the language of mathe- 
matics. But this language is far from unique. Many different algebraic systems 
exist and are in use today, all with their own advantages and disadvantages. In 
this book we describe what we believe to be the most powerful available mathe- 
matical system developed to date. This is geometric algebra, which is presented 
as a new mathematical tool to add to your existing set as either a theoretician or
experimentalist. Our aim is to introduce the new techniques via their applica- 
tions, rather than as purely formal mathematics. These applications are diverse, 
and throughout we emphasise the unity of the mathematics underpinning each 
of these topics. 

The history of geometric algebra is one of the more unusual tales in the de- 
velopment of mathematical physics. William Kingdon Clifford introduced his 
geometric algebra in the 1870s, building on the earlier work of Hamilton and 
Grassmann. It is clear from his writing that Clifford intended his algebra to 
describe the geometric properties of vectors, planes and higher-dimensional ob- 
jects. But most physicists first encounter the algebra in the guise of the Pauli 
and Dirac matrix algebras of quantum theory. Few then contemplate using these 
unwieldy matrices for practical geometric computing. Indeed, some physicists 
come away from a study of Dirac theory with the view that Clifford’s algebra 
is inherently quantum-mechanical. In this book we aim to dispel this belief by 
giving a straightforward introduction to this new and fundamentally different 
approach to vectors and vector multiplication. In this language much of the 
standard subject matter taught to physicists can be formulated in an elegant 
and highly condensed fashion. And the portability of the techniques we discuss 
enables us to reach a range of advanced topics with little extra work. 

This book is intended to be of interest to both students and researchers in 
physics. The early chapters grew out of an undergraduate lecture course that we 
have run for a number of years in the Physics Department at Cambridge
University. We are indebted to the students who attended the early versions of this
course, and helped to shape the material into a form suitable for undergraduate 
tuition. These early chapters require little more than a basic knowledge of linear 
algebra and vector geometry, and some exposure to classical mechanics. More 
advanced physical concepts are introduced as the book progresses. 

A number of themes run throughout this book. The first is that geometric 
algebra enables us to express fundamental physics in a language that is free from 
coordinates or indices. Coordinates are only introduced later, when the geom- 
etry of a given problem is clear. This approach gives many equations a degree 
of clarity which is lost in tensor algebra. A second theme is the way in which 
rotations are handled in geometric algebra through the use of rotors. This ap- 
proach extends to arbitrary spaces the idea of using a complex phase to rotate in 
a plane. Rotor techniques can be applied in spaces of arbitrary signature and are
particularly well suited to formulating Lorentz and conformal transformations.
The latter are central to our treatment of non-Euclidean geometry. Rotors also 
provide a framework for studying Lie groups and Lie algebras, and are essential 
to our discussion of gauge theories. 

The third theme is the invertibility of the geometric product of vectors, which 
makes it possible to divide by a vector. This idea extends to the vector derivative, 
which has an inverse in the form of a first-order Green’s function. The vector
derivative and its inverse enable us to extend complex analytic function theory 
to arbitrary dimensions. This theory is perfectly suited to electromagnetism, 
as all four Maxwell equations can be combined into a single spacetime equation 
involving the invertible vector derivative. The same vector derivative appears 
in the Dirac theory, and is central to the gauge treatment of gravitation which 
dominates the final two chapters of this book. 

This book would not have been possible without the help and encouragement 
of a large number of people. We thank Stephen Gull for helping initiate much 
of the research described here, for his constant advice and criticism, and for use 
of a number of his figures. We also thank David Hestenes for all his work in 
shaping the modern subject of geometric algebra and for his constant encour- 
agement. Special mention must be made of our many collaborators, in particular 
Joan Lasenby, Anthony Challinor, Leo Dorst, Tim Havel, Antony Lewis, Mark 
Ashdown, Frank Sommen, Shyamal Somaroo, Jeff Tomasi, Bill Fitzgerald, Youri 
Dabrowski and Mike Hobson. Special thanks also go to Mike for his help with
LaTeX and explaining the intricacies of the CUP style files. We thank the Physics
Department of Cambridge University for the use of their facilities, and for the 
range of technical advice and expertise we regularly called on. Finally we thank 
everyone at Cambridge University Press who helped in the production of this 
book. 

CD would also like to thank the EPSRC and Sidney Sussex College for their 
support, his friends and colleagues, all at Nomads HC, and above all Helen for
not complaining about the lost evenings as I worked on this book. I promise to
finish the decorating now it is complete. 

AL thanks Joan and his children Robert and Alison for their constant enthu- 
siasm and support, and their patience in the face of many explanations of topics 
from this book. 


Cambridge C.J.L. Doran 
July 2002 A.N. Lasenby

Notation


The subject of vector geometry in general, and geometric algebra in particular, 
suffers from a profusion of notations and conventions. In short, there is no 
single convention that is perfectly suited to the entire range of applications of 
geometric algebra. For example, many of the formulae and results given in 
this book involve arbitrary numbers of vectors and are valid in vector spaces 
of arbitrary dimensions. These formulae invariably look neater if one does not 
embolden all of the vectors in the expression. For this reason we typically choose 
to write vectors in a lower case italic script, a, and more general multivectors in 
upper case italic script, M. But in some applications, particularly mechanics and 
dynamics, one often needs to reserve lower case italic symbols for coordinates 
and scalars, and in these situations writing vectors in bold face is helpful. This 
convention is adopted in chapter 3.

For many applications it is useful to have a notation which distinguishes frame 
vectors from general vectors. In these cases we write the former in an upright 
font as $\{\mathrm{e}_i\}$. But this notation looks clumsy in certain settings, and is not
followed rigorously in some of the later chapters. In this book our policy is to 
ensure that we adopt a consistent notation within each chapter, and any new or 
distinct features are explained either at the start of the chapter or at their point 
of introduction. 

Some conventions are universally adopted throughout this book, and for con- 
venience we have gathered together a number of these here. 


(i) The geometric (or Clifford) algebra generated by the vector space of signature
$(p,q)$ is denoted $G(p,q)$. In the first three chapters we employ the
abbreviations $G_2$ and $G_3$ for the Euclidean algebras $G(2,0)$ and $G(3,0)$. In
chapter 4 we use $G_n$ to denote all algebras $G(p,q)$ of total dimension $n$.

(ii) The geometric product of $A$ and $B$ is denoted by juxtaposition, $AB$.

(iii) The inner product is written with a centred dot, $A \cdot B$. The inner product
is only employed between homogeneous multivectors.

(iv) The outer (exterior) product is written with a wedge, $A \wedge B$. The outer
product is also only employed between homogeneous multivectors.

(v) Inner and outer products are always performed before geometric products.
This enables us to remove unnecessary brackets. For example, the
expression $a \cdot b\, c$ is to be read as $(a \cdot b)c$.

(vi) Angled brackets $\langle M \rangle_p$ are used to denote the result of projecting onto the
terms in $M$ of grade $p$. The subscript zero is dropped for the projection
onto the scalar part.

(vii) The reverse of the multivector $M$ is denoted either with a dagger, $M^\dagger$, or
with a tilde, $\tilde{M}$. The latter is employed for applications in spacetime.

(viii) Linear functions are written in an upright font as $\mathsf{F}(a)$ or $\mathsf{h}(a)$. This
helps to distinguish linear functions from multivectors. Some exceptions
are encountered in chapters 13 and 14, where calligraphic symbols are
used for certain tensors in gravitation. The adjoint of a linear function is
denoted with a bar, $\bar{\mathsf{h}}(a)$.

(ix) Lie groups are written in capital, Roman font as in SU(n). The corresponding
Lie algebra is written in lower case, su(n).


Further details concerning the conventions adopted in this book can be found 
in sections 2.5 and 4.1.

Introduction


The goal of expressing geometrical relationships through algebraic equations has 
dominated much of the development of mathematics. This line of thinking goes 
back to the ancient Greeks, who constructed a set of geometric laws to describe 
the world as they saw it. Their view of geometry was largely unchallenged 
until the eighteenth century, when mathematicians discovered new geometries 
with different properties from the Greeks’ Euclidean geometry. Each of these 
new geometries had distinct algebraic properties, and a major preoccupation 
of nineteenth century mathematicians was to place these geometries within a 
unified algebraic framework. One of the key insights in this process was made by 
W.K. Clifford, and this book is concerned with the implications of his discovery. 

Before we describe Clifford’s discovery (in chapter 2) we have gathered to- 
gether some introductory material of use throughout this book. This chapter 
revises basic notions of vector spaces, emphasising pictorial representations of 
the underlying algebraic rules — a theme which dominates this book. The ma- 
terial is presented in a way which sets the scene for the introduction of Clifford’s 
product, in part by reflecting the state of play when Clifford conducted his re- 
search. To this end, much of this chapter is devoted to studying the various 
products that can be defined between vectors. These include the scalar and 
vector products familiar from three-dimensional geometry, and the complex and 
quaternion products. We also introduce the outer or exterior product, though 
this is covered in greater depth in later chapters. The material in this chapter is 
intended to be fairly basic, and those impatient to uncover Clifford’s insight may 
want to jump straight to chapter 2. Readers unfamiliar with the outer product 
are encouraged to read this chapter, however, as it is crucial to understanding 
Clifford’s discovery. 




1.1 Vector (linear) spaces 


At the heart of much of geometric algebra lies the idea of vector, or linear, spaces.
Some properties of these are summarised here and assumed throughout this book. 
In this section we talk in terms of vector spaces, as this is the more common 
term. For all other occurrences, however, we prefer to use the term linear space. 
This is because the term ‘vector’ has a very specific meaning within geometric 
algebra (as the grade-1 elements of the algebra). 


1.1.1 Properties 


Vector spaces are defined in terms of two objects. These are the vectors, which 
can often be visualised as directions in space, and the scalars, which are usually 
taken to be the real numbers. The vectors have a simple addition operation rule 
with the following obvious properties: 


(i) Addition is commutative:
        $a + b = b + a$. (1.1)
(ii) Addition is associative:
        $a + (b + c) = (a + b) + c$. (1.2)
    This property enables us to write expressions such as $a + b + c$ without
    ambiguity.
(iii) There is an identity element, denoted 0:
        $a + 0 = a$. (1.3)
(iv) Every element $a$ has an inverse $-a$:
        $a + (-a) = 0$. (1.4)


For the case of directed line segments each of these properties has a clear geo- 
metric equivalent. These are illustrated in figure 1.1. 

Vector spaces also contain a multiplication operation between the scalars and 
the vectors. This has the property that for any scalar $\lambda$ and vector $a$, the product
$\lambda a$ is also a member of the vector space. Geometrically, this corresponds to the
dilation operation. The following further properties also hold for any scalars $\lambda$, $\mu$
and vectors $a$ and $b$:


(i) $\lambda(a + b) = \lambda a + \lambda b$;
(ii) $(\lambda + \mu)a = \lambda a + \mu a$;
(iii) $(\lambda\mu)a = \lambda(\mu a)$;
(iv) if $1\lambda = \lambda$ for all scalars $\lambda$ then $1a = a$ for all vectors $a$.


Figure 1.1 A geometric picture of vector addition. The result of a + b is 
formed by adding the tail of b to the head of a. As is shown, the resultant 
vector a + b is the same as b+ a. This finds an algebraic expression in the 
statement that addition is commutative. In the right-hand diagram the 
vector a + b + c is constructed two different ways, as a + (b + c) and as
(a+b)+c. The fact that the results are the same is a geometric expression 
of the associativity of vector addition. 


The preceding set of rules serves to define a vector space completely. Note that
the + operation connecting scalars is different from the + operation connecting
the vectors. There is no ambiguity, however, in using the same symbol for both.
The following two definitions will be useful later in this book: 


(i) Two vector spaces are said to be isomorphic if their elements can be 
placed in a one-to-one correspondence which preserves sums, and there 
is a one-to-one correspondence between the scalars which preserves sums 
and products. 

(ii) If U and V are two vector spaces (sharing the same scalars) and all the
elements of U are contained in V, then U is said to form a subspace of V.


1.1.2 Bases and dimension 


The concept of dimension is intuitive for simple vector spaces — lines are one- 
dimensional, planes are two-dimensional, and so on. Equipped with the axioms 
of a vector space we can proceed to a formal definition of the dimension of a 
vector space. First we need to define some terms. 


(i) A vector $b$ is said to be a linear combination of the vectors $a_1, \ldots, a_n$ if
scalars $\lambda_1, \ldots, \lambda_n$ can be found such that
        $b = \lambda_1 a_1 + \cdots + \lambda_n a_n = \sum_{i=1}^{n} \lambda_i a_i$. (1.5)
(ii) A set of vectors $\{a_1, \ldots, a_n\}$ is said to be linearly dependent if scalars
$\lambda_1, \ldots, \lambda_n$ (not all zero) can be found such that
        $\lambda_1 a_1 + \cdots + \lambda_n a_n = 0$. (1.6)
If such a set of scalars cannot be found, the vectors are said to be linearly
independent.

(iii) A set of vectors $\{a_1, \ldots, a_n\}$ is said to span a vector space $V$ if every
element of $V$ can be expressed as a linear combination of the set.

(iv) A set of vectors which are both linearly independent and span the space
$V$ are said to form a basis for $V$.


These definitions all carry an obvious, intuitive picture if one thinks of vectors 
in a plane or in three-dimensional space. For example, it is clear that two 
independent vectors in a plane provide a basis for all vectors in that plane, 
whereas any three vectors in the plane are linearly dependent. These axioms and 
definitions are sufficient to prove the basis theorem, which states that all bases 
of a vector space have the same number of elements. This number is called the 
dimension of the space. Proofs of this statement can be found in any textbook 
on linear algebra, and a sample proof is left to work through as an exercise. Note 
that any two vector spaces of the same dimension and over the same field are 
isomorphic. 
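
As an illustration of these definitions, the following is a minimal sketch of a test for linear dependence. It assumes the vectors are supplied as lists of components in some fixed basis; the function names `rank` and `is_linearly_independent` are our own choices for this example and do not appear elsewhere in the text. The test uses Gaussian elimination with exact rational arithmetic.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length component lists, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Subtract multiples of the pivot row to clear the column below it.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True if no non-trivial linear combination of the vectors vanishes."""
    return rank(vectors) == len(vectors)


# Two independent vectors in a plane form a basis for that plane ...
plane_pair = [[1, 0], [1, 1]]
# ... whereas any three vectors in the plane are linearly dependent.
plane_triple = [[1, 0], [1, 1], [2, 3]]
```

Here `is_linearly_independent(plane_pair)` holds while `is_linearly_independent(plane_triple)` does not, in line with the remark above about vectors in a plane.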

The axioms for a vector space define an abstract mathematical entity which 
is already well equipped for studying problems in geometry. In so doing we are 
not compelled to interpret the elements of the vector space as displacements. 
Often different interpretations can be attached to isomorphic spaces, leading to 
different types of geometry (affine, projective, finite, etc.). For most problems 
in physics, however, we need to be able to do more than just add the elements 
of a vector space; we need to multiply them in various ways as well. This is 
necessary to formalise concepts such as angles and lengths and to construct 
higher-dimensional surfaces from simple vectors. 

Constructing suitable products was a major concern of nineteenth century 
mathematicians, and the concepts they introduced are integral to modern math- 
ematical physics. In the following sections we study some of the basic concepts 
that were successfully formulated in this period. The culmination of this work, 
Clifford’s geometric product, is introduced separately in chapter 2. At various 
points in this book we will see how the products defined in this section can all 
be viewed as special cases of Clifford’s geometric product. 


1.2 The scalar product 


Euclidean geometry deals with concepts such as lines, circles and perpendicularity.
In order to arrive at Euclidean geometry we need to add two new concepts
to our vector space. These are distances between points, which allow us to define
a circle, and angles between vectors so that we can say that two lines are
perpendicular. The introduction of a scalar product achieves both of these goals. 

Given any two vectors $a$, $b$, the scalar product $a \cdot b$ is a rule for obtaining a
number with the following properties:


(i) $a \cdot b = b \cdot a$;
(ii) $a \cdot (\lambda b) = \lambda (a \cdot b)$;
(iii) $a \cdot (b + c) = a \cdot b + a \cdot c$;
(iv) $a \cdot a > 0$, unless $a = 0$.


(When we study relativity, this final property will be relaxed.) The introduction 
of a scalar product allows us to define the length of a vector, |a|, by 


        $|a| = \sqrt{a \cdot a}$. (1.7)


Here, and throughout this book, the positive square root is always implied by
the $\sqrt{\ }$ symbol. The fact that we now have a definition of lengths and distances
means that we have specified a metric space. Many different types of metric 
space can be constructed, of which the simplest are the Euclidean spaces we 
have just defined. 

The fact that for Euclidean space the inner product is positive-definite means 
that we have a Schwarz inequality of the form 


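        $|a \cdot b| \le |a|\,|b|$. (1.8)
The proof is straightforward:
        $(a + \lambda b) \cdot (a + \lambda b) \ge 0 \quad \forall \lambda$
        $\Rightarrow\ a \cdot a + 2\lambda\, a \cdot b + \lambda^2\, b \cdot b \ge 0 \quad \forall \lambda$
        $\Rightarrow\ (a \cdot b)^2 \le a \cdot a \; b \cdot b$, (1.9)
where the last step follows by taking the discriminant of the quadratic in $\lambda$.
Since all of the numbers in this inequality are positive we recover (1.8). We can
now define the angle $\theta$ between $a$ and $b$ by


        $a \cdot b = |a|\,|b| \cos(\theta)$. (1.10)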
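
The definitions (1.7) to (1.10) can be sketched numerically as follows. This is only an illustration under the assumption that vectors are stored as lists of components in an orthonormal basis of Euclidean space; the helper names `dot`, `length` and `angle` are ours.

```python
import math


def dot(a, b):
    """Euclidean scalar product of two component lists."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Length |a| as the positive square root of a.a, as in (1.7)."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle between a and b, obtained by inverting (1.10)."""
    c = dot(a, b) / (length(a) * length(b))
    # Clamp against rounding error; the Schwarz inequality guarantees |c| <= 1.
    return math.acos(max(-1.0, min(1.0, c)))


a = [1.0, 2.0, 2.0]   # length 3
b = [3.0, 0.0, 4.0]   # length 5
schwarz_holds = abs(dot(a, b)) <= length(a) * length(b)   # 11 <= 15
theta = angle(a, b)   # cos(theta) = 11/15
```

The clamp in `angle` is only a guard against floating-point rounding; for exact arithmetic the Schwarz inequality (1.8) already ensures that the argument of the inverse cosine lies between -1 and 1.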


Two vectors whose scalar product is zero are said to be orthogonal. It is usually 
convenient to work with bases in which all of the vectors are mutually orthogonal. 
If all of the basis vectors are further normalised to have unit length, they are 
said to form an orthonormal basis. If the set of vectors $\{e_1, \ldots, e_n\}$ denote such
a basis, the statement that the basis is orthonormal can be summarised as


        $e_i \cdot e_j = \delta_{ij}$. (1.11)
Here $\delta_{ij}$ is the Kronecker delta function, defined by


        $\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$ (1.12)


We can expand any vector $a$ in this basis as
        $a = \sum_{i=1}^{n} a_i e_i = a_i e_i$, (1.13)


where we have started to employ the Einstein summation convention that pairs 
of indices in any expression are summed over. This convention will be assumed 
throughout this book. The $\{a_i\}$ are the components of the vector $a$ in the $\{e_i\}$
basis. These are found simply by


        $a_i = e_i \cdot a$. (1.14)


The scalar product of two vectors $a = a_i e_i$ and $b = b_i e_i$ can now be written simply
as

        $a \cdot b = (a_i e_i) \cdot (b_j e_j) = a_i b_j \delta_{ij} = a_i b_i$. (1.15)

In spaces where the inner product is not positive-definite, such as Minkowski 
spacetime, there is no equivalent version of the Schwarz inequality. In such cases 
it is often only possible to define an ‘angle’ between vectors by replacing the 
cosine function with a cosh function. In these cases we can still introduce ortho- 
normal frames and use these to compute scalar products. The main modification 
is that the Kronecker delta is replaced by $\eta_{ij}$, which again is zero if $i \neq j$, but
can take values $\pm 1$ if $i = j$.
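
A short numerical sketch makes the failure of the Schwarz inequality concrete. The signature (+, -, -, -) used for the diagonal entries of $\eta_{ij}$ below is an assumed convention chosen for this example, and the function name `scalar_product` is ours.

```python
import math


def scalar_product(a, b, eta):
    """Scalar product in an orthonormal frame with diagonal metric entries eta."""
    return sum(s * x * y for s, x, y in zip(eta, a, b))


eta = (1, -1, -1, -1)   # assumed signature convention for this sketch

null = (1, 1, 0, 0)     # a non-zero vector with vanishing square
a = (2, 1, 0, 0)        # two timelike vectors
b = (2, -1, 0, 0)

null_square = scalar_product(null, null, eta)                    # 1 - 1 = 0
lhs = scalar_product(a, b, eta) ** 2                              # (4 + 1)^2 = 25
rhs = scalar_product(a, a, eta) * scalar_product(b, b, eta)       # 3 * 3 = 9

# For these two timelike vectors the 'angle' is defined with cosh rather than cos.
hyperbolic_angle = math.acosh(scalar_product(a, b, eta) / math.sqrt(rhs))
```

Here `lhs` exceeds `rhs`, so the inequality (1.9) is violated, and the ratio of the scalar product to the product of the lengths is greater than one. It can therefore be matched to a cosh function but not to a cosine.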


1.3 Complex numbers 


The scalar product is the simplest product one can define between vectors, and 
once such a product is defined one can formulate many of the key concepts of 
Euclidean geometry. But this is by no means the only product that can be defined 
between vectors. In two dimensions a new product can be defined via complex 
arithmetic. A complex number can be viewed as an ordered pair of real numbers 
which represents a direction in the complex plane, as was realised by Wessel in 
1797. Their product enables complex numbers to perform geometric operations, 
such as rotations and dilations. But suppose that we take the complex number 
$z = x + iy$ and square it, forming


        $z^2 = (x + iy)^2 = x^2 - y^2 + 2xyi$. (1.16)


In terms of vector arithmetic, neither the real nor imaginary parts of this expression
have any geometric significance. A more geometrically useful product
is defined instead by
        $zz^* = (x + iy)(x - iy) = x^2 + y^2$, (1.17)


which returns the square of the length of the vector. A product of two vectors 
in a plane, z and w = u + vi, can therefore be constructed as 


        $zw^* = (x + iy)(u - iv) = xu + vy + i(uy - vx)$. (1.18)


The real part of the right-hand side recovers the scalar product. To understand 
the imaginary term consider the polar representation 


        $z = |z|e^{i\theta}, \quad w = |w|e^{i\phi}$ (1.19)
so that
        $zw^* = |z||w|e^{i(\theta - \phi)}$. (1.20)


The imaginary term has magnitude $|z||w|\sin(\theta - \phi)$, where $\theta - \phi$ is the angle
between the two vectors. The magnitude of this term is therefore the area of 
the parallelogram defined by z and w. The sign of the term conveys information 
about the handedness of the area element swept out by the two vectors. This 
will be defined more carefully in section 1.6. 
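
The two parts of the product (1.18) can be checked with a small numerical sketch using Python's built-in complex type. The particular vectors chosen are arbitrary, and the variable names are ours.

```python
import cmath
import math

z = complex(3, 1)   # the vector (x, y) = (3, 1)
w = complex(1, 2)   # the vector (u, v) = (1, 2)

product = z * w.conjugate()

scalar_part = product.real   # xu + vy, the scalar product of the two vectors
area_part = product.imag     # uy - vx, a signed area

# The polar form (1.20) gives the same imaginary part as |z||w| sin(theta - phi).
theta = cmath.phase(z)
phi = cmath.phase(w)
polar_area = abs(z) * abs(w) * math.sin(theta - phi)
```

For these vectors the real part is 5, which is the scalar product 3·1 + 1·2, and the imaginary part is -5. Its magnitude is the area of the parallelogram with sides (3, 1) and (1, 2), and its sign changes if the roles of z and w are exchanged.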

We thus have a satisfactory interpretation for both the real and imaginary 
parts of the product zw*. The surprising feature is that these are still both parts 
of a complex number. We thus have a second interpretation for complex addition,
as a sum between scalar objects and objects representing plane segments. The 
advantages of adding these together are precisely the advantages of working with 
complex numbers as opposed to pairs of real numbers. This is a theme to which 
we shall return regularly in following chapters. 


1.4 Quaternions 


The fact that complex arithmetic can be viewed as representing a product for 
vectors in a plane carries with it a further advantage — it allows us to divide 
by a vector. Generalising this to three dimensions was a major preoccupation 
of the physicist W.R. Hamilton (see figure 1.2). Since a complex number $x + iy$
can be represented by two rectangular axes on a plane it seemed reasonable to
represent directions in space by a triplet consisting of one real and two complex
numbers. These can be written as $x + iy + jz$, where the third term $jz$ represents
a third axis perpendicular to the other two. The complex numbers $i$ and $j$ have
the properties that $i^2 = j^2 = -1$. The norm for such a triplet would then be


        $(x + iy + jz)(x - iy - jz) = (x^2 + y^2 + z^2) - yz(ij + ji)$. (1.21)
The final term is problematic, as one would like to recover the scalar product
here. The obvious solution to this problem is to set $ij = -ji$ so that the last
term vanishes.

Figure 1.2 William Rowan Hamilton 1805-1865. Inventor of quaternions,
and one of the key scientific figures of the nineteenth century. He spent 
many years frustrated at being unable to extend his theory of couples of 
numbers (complex numbers) to three dimensions. In the autumn of 1843 
he returned to this problem, quite possibly prompted by a visit he received 
from the young German mathematician Eisenstein. Among Eisenstein’s
papers was the observation that matrices form the elements of an alge- 
bra that was much like ordinary arithmetic except that multiplication was 
non-commutative. This was the vital step required to find the quater- 
nion algebra. Hamilton arrived at this algebra on 16 October 1843 while 
out walking with his wife, and carved the equations in stone on Brougham 
Bridge. His discovery of quaternions is perhaps the best-documented math- 
ematical discovery ever. 


The anticommutative law ij = —ji ensures that the norm of a triplet behaves 
sensibly, and also that multiplication of triplets in a plane behaves in a reasonable 
manner. The same is not true for the general product of triplets, however. 
Consider 


        $(a + ib + jc)(x + iy + jz) = (ax - by - cz) + i(ay + bx)$
        $\qquad + j(az + cx) + ij(bz - cy)$. (1.22)


Setting $ij = -ji$ is no longer sufficient to remove the $ij$ term, so the algebra
does not close. The only thing for Hamilton to do was to set $ij = k$, where $k$ is
some unknown, and see if it could be removed somehow. While walking along 
the Royal Canal he suddenly realised that if his triplets were instead made up 
of four terms he would be able to close the algebra in a simple, symmetric way. 




To understand his discovery, consider 


        $(a + ib + jc + kd)(a - ib - jc - kd)$
        $\qquad = a^2 + b^2 + c^2 + d^2(-k^2) - bd(ik + ki) - cd(jk + kj)$, (1.23)
where we have assumed that $i^2 = j^2 = -1$ and $ij = -ji$. The expected norm of
the above product is $a^2 + b^2 + c^2 + d^2$, which is obtained by setting $k^2 = -1$ and
$ik = -ki$ and $jk = -kj$. So what values do we use for $jk$ and $ik$? These follow
from the fact that $ij = k$, which gives

        $ik = i(ij) = (ii)j = -j$ (1.24)
and
        $kj = (ij)j = -i$. (1.25)
Thus the multiplication rules for quaternions are
        $i^2 = j^2 = k^2 = -1$ (1.26)
and
        $ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j$. (1.27)
These can be summarised neatly as $i^2 = j^2 = k^2 = ijk = -1$. It is a simple
matter to check that these multiplication laws define a closed algebra.
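
The check can be sketched in a few lines of code. The representation of a quaternion a + ib + jc + kd as a tuple (a, b, c, d), and the names `qmul` and `conj`, are assumptions made for this example. The product formula is obtained by expanding the product of two such quaternions with the rules (1.26) and (1.27).

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + ib + jc + kd."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # scalar part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k component
    )


def conj(q):
    """Reverse the sign of the three 'imaginary' components."""
    a, b, c, d = q
    return (a, -b, -c, -d)


I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

ij = qmul(I, J)             # equals K
ijk = qmul(qmul(I, J), K)   # equals MINUS_ONE

q = (1, 2, 3, 4)
norm = qmul(q, conj(q))     # pure scalar: 1 + 4 + 9 + 16 = 30
```

Every product of two basis elements is again plus or minus a basis element, so the four-component tuples are closed under `qmul`, and the product of a quaternion with its conjugate returns the expected sum of squares.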

Hamilton was so excited by his discovery that the very same day he obtained 
leave to present a paper on the quaternions to the Royal Irish Academy. The 
subsequent history of the quaternions is a fascinating story which has been de- 
scribed by many authors. Some suggested material for further reading is given 
at the end of this chapter. In brief, despite the many advantages of working with 
quaternions, their development was blighted by two major problems. 

The first problem was the status of vectors in the algebra. Hamilton identified 
vectors with pure quaternions, which had a null scalar part. On the surface 
this seems fine — pure quaternions define a three-dimensional vector space. 
Indeed, Hamilton invented the word ‘vector’ precisely for these objects and this 
is the origin of the now traditional use of i, j and k for a set of orthonormal
basis vectors. Furthermore, the full product of two pure quaternions led to the 
definition of the extremely useful cross product (see section 1.5). The problem 
is that the product of two pure vectors does not return a new pure vector, so 
the vector part of the algebra does not close. This means that a number of ideas 
in complex analysis do not extend easily to three dimensions. Some people felt 
that this meant that the full quaternion product was of little use, and that the 
scalar and vector parts of the product should be kept separate. This criticism 
misses the point that the quaternion product is invertible, which does bring many 
advantages. 

The second major difficulty encountered with quaternions was their use in
describing rotations. The irony here is that quaternions offer the clearest way
of handling rotations in three dimensions, once one realises that they provide 
a ‘spin-1/2’ representation of the rotation group. That is, if a is a vector (a 
pure quaternion) and R is a unit quaternion, a new vector is obtained by the 
double-sided transformation law 


        $a' = RaR^*$, (1.28)


where the * operation reverses the sign of all three ‘imaginary’ components. A
consequence of this is that each of the basis quaternions i, j and k generates
rotations through $\pi$. Hamilton, however, was led astray by the analogy with
complex numbers and tried to impose a single-sided transformation of the form
$a' = Ra$. This works if the axis of rotation is perpendicular to $a$, but otherwise
does not return a pure quaternion. More damagingly, it forces one to interpret
the basis quaternions as generators of rotations through $\pi/2$, which is simply
wrong!
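
The double-sided law (1.28) can be illustrated with a short sketch. As before, the tuple representation (a, b, c, d) for a + ib + jc + kd and the helper names `qmul`, `conj` and `rotate` are assumptions made for this example. The unit quaternion used for a rotation through an angle about the k axis, with cosine and sine of the half-angle, is the standard parameterisation and is not derived in the text above.

```python
import math


def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + ib + jc + kd."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    """The * operation: reverse the sign of the three 'imaginary' components."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def rotate(R, a):
    """Double-sided transformation a' = R a R* of equation (1.28)."""
    return qmul(qmul(R, a), conj(R))


I = (0.0, 1.0, 0.0, 0.0)
J = (0.0, 0.0, 1.0, 0.0)

# A basis quaternion used as R rotates through pi: i leaves i fixed and sends j to -j.
fixed = rotate(I, I)
flipped = rotate(I, J)

# Rotation through pi/2 about the k axis uses the half-angle pi/4 in R.
half_angle = math.pi / 4
R = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
quarter_turn = rotate(R, I)   # approximately j

# The single-sided form Ra fails when a lies along the axis: i i = -1 is a scalar.
single_sided = qmul(I, I)
```

The appearance of the half-angle in `R` is the 'spin-1/2' behaviour referred to above: the quaternion i itself corresponds to a half-angle of π/2 and hence to a rotation through π.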

Despite the problems with quaternions, it was clear to many that they were 
a useful mathematical system worthy of study. Tait claimed that quaternions 
‘freed the physicist from the constraints of coordinates and allowed thoughts to 
run in their most natural channels’ — a theme we shall frequently meet in this 
book. Quaternions also found favour with the physicist James Clerk Maxwell, 
who employed them in his development of the theory of electromagnetism. De- 
spite these successes, however, quaternions were weighed down by the increas- 
ingly dogmatic arguments over their interpretation and were eventually displaced 
by the hybrid system of vector algebra promoted by Gibbs. 


1.5 The cross product 


Two of the lasting legacies of the quaternion story are the introduction of the 
idea of a vector, and the cross product between two vectors. Suppose we form 
the product of two pure quaternions a and b, where 


        $a = a_1 i + a_2 j + a_3 k, \quad b = b_1 i + b_2 j + b_3 k$. (1.29)
Their product can be written
        $ab = -a_i b_i + c$, (1.30)
where $c$ is the pure quaternion
        $c = (a_2 b_3 - a_3 b_2)i + (a_3 b_1 - a_1 b_3)j + (a_1 b_2 - a_2 b_1)k$. (1.31)
Writing $c = c_1 i + c_2 j + c_3 k$ the component relation can be written as
        $c_i = \epsilon_{ijk} a_j b_k$, (1.32)
where the alternating tensor $\epsilon_{ijk}$ is defined by


        $\epsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ is a cyclic permutation of } 123, \\ -1 & \text{if } ijk \text{ is an anticyclic permutation of } 123, \\ 0 & \text{otherwise.} \end{cases}$ (1.33)


We recognise the preceding as defining the cross product of two vectors, $a \times b$.
This has the following properties:


(i) $a \times b$ is perpendicular to the plane defined by $a$ and $b$;
(ii) $a \times b$ has magnitude $|a||b|\sin(\theta)$;
(iii) the vectors $a$, $b$ and $a \times b$ form a right-handed set.


These properties can alternatively be viewed as defining the cross product, and 
from them the algebraic definition can be recovered. This is achieved by starting 
with a right-handed orthonormal frame $\{e_i\}$. For these we must have


        $e_1 \times e_2 = e_3$ etc. (1.34)
so that we can write
        $e_i \times e_j = \epsilon_{ijk} e_k$. (1.35)
Expanding out a vector in terms of this basis recovers the formula


        $a \times b = (a_i e_i) \times (b_j e_j)$
        $\qquad = a_i b_j (e_i \times e_j)$
        $\qquad = (\epsilon_{ijk} a_i b_j) e_k$. (1.36)


Hence the geometric definition recovers the algebraic one. 
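
The component formula (1.32) translates directly into code. The following is a minimal sketch, assuming components are stored in a list with respect to a right-handed orthonormal frame; the function names `epsilon`, `cross` and `dot` are ours.

```python
def epsilon(i, j, k):
    """Alternating tensor of (1.33) for indices taking the values 1, 2, 3."""
    if (i, j, k) in {(1, 2, 3), (2, 3, 1), (3, 1, 2)}:
        return 1    # cyclic permutations of 123
    if (i, j, k) in {(3, 2, 1), (2, 1, 3), (1, 3, 2)}:
        return -1   # anticyclic permutations of 123
    return 0        # any repeated index


def cross(a, b):
    """Components c_i = epsilon_ijk a_j b_k, summed over j and k."""
    indices = (1, 2, 3)
    return [
        sum(epsilon(i, j, k) * a[j - 1] * b[k - 1] for j in indices for k in indices)
        for i in indices
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
a = [1, 2, 3]
b = [4, 5, 6]
c = cross(a, b)   # components (a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1)
```

Applied to the frame vectors this reproduces (1.34), and the result `c` has vanishing scalar product with both `a` and `b`, in agreement with property (i).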

The cross product quickly proved itself to be invaluable to physicists, dra- 
matically simplifying equations in dynamics and electromagnetism. In the latter 
part of the nineteenth century many physicists, most notably Gibbs, advocated 
abandoning quaternions altogether and just working with the individual scalar 
and cross products. We shall see in later chapters that Gibbs was misguided in 
some of his objections to the quaternion product, but his considerable reputa- 
tion carried the day and by the 1900s quaternions had all but disappeared from 
mainstream physics. 


1.6 The outer product 


Figure 1.3 Hermann Gunther Grassmann (1809-1877), born in Stettin,
Germany (now Szczecin, Poland). A German mathematician and schoolteacher,
Grassmann was the third of his parents’ twelve children and was
born into a family of scholars. His father studied theology and became a
minister, before switching to teaching mathematics and physics at the Stettin
Gymnasium. Hermann followed in his father’s footsteps, first studying
theology, classical languages and literature at Berlin. After returning to
Stettin in 1830 he turned his attention to mathematics and physics. Grassmann
passed the qualifying examination to win a teaching certificate in
1839. This exam included a written assignment on the tides, for which he
gave a simplified treatment of Laplace’s work based upon a new geometric
calculus that he had developed. By 1840 he had decided to concentrate
on mathematics research. He published the first edition of his geometric
calculus, the 300 page Lineale Ausdehnungslehre in 1844, the same year
that Hamilton announced the discovery of the quaternions. His work did
not achieve the same impact as the quaternions, however, and it was many
years before his ideas were understood and appreciated by other mathematicians.
Disappointed by this lack of interest, Grassmann turned his
attention to linguistics and comparative philology, with greater immediate
impact. He was an expert in Sanskrit and translated the Rig-Veda (1876-1877).
He also formulated the linguistic law (named after him) stating
that in Indo-European bases, successive syllables may not begin with aspirates.
He died before he could see his ideas on geometry being adopted
into mainstream mathematics.


The cross product has one major failing: it only exists in three dimensions. In
two dimensions there is nowhere else to go, whereas in four dimensions the concept
of a vector orthogonal to a pair of vectors is not unique. To see this, consider
four orthonormal vectors $e_1, \ldots, e_4$. If we take the pair $e_1$ and $e_2$ and attempt
to find a vector perpendicular to both of these, we see that any combination of
$e_3$ and $e_4$ will do.
A suitable generalisation of the idea of the cross product was constructed by 
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g i 


Figure 1.4 The outer product. The outer or wedge product of a and b 
returns a directed area element of area |a||b| sin(@). The orientation of the 
parallelogram is defined by whether the circuit a, b, —a, —b is right-handed 
(anticlockwise) or left-handed (clockwise). Interchanging the order of the 
vectors reverses the orientation and introduces a minus sign in the product. 


the remarkable German mathematician H.G. Grassmann (see figure 1.3). His 
work had its origin in the Barycentrischer Calcul of Mobius. There the author 
introduced expressions like AB for the line connecting the points A and B and 
ABC for the triangle defined by A, B and C. Möbius also introduced the 
crucial idea that the sign of the quantity should change if any two points are 
interchanged. (These oriented segments are now referred to as simplices.) It was 
Grassmann’s leap of genius to realise that expressions like AB could actually be 
viewed as a product between vectors. He thus introduced the outer or exterior 
product which, in modern notation, we write as a A^ b, or ‘a wedge b’. 

The outer product can be defined on any vector space and, geometrically, we 
are not forced to picture these vectors as displacements. Indeed, Grassmann 
was motivated by a projective viewpoint, where the elements of the vector space 
are interpreted as points, and the outer product of two points defines the line 
through the points. For our purposes, however, it is simplest to adopt a pic- 
ture in which vectors represent directed line segments. The outer product then 
provides a means of encoding a plane, without relying on the notion of a vector 
perpendicular to it. The result of the outer product is therefore neither a scalar 
nor a vector. It is a new mathematical entity encoding an oriented plane and is 
called a bivector. It can be visualised as the parallelogram obtained by sweep- 
ing one vector along the other (figure 1.4). Changing the order of the vectors 
reverses the orientation of the plane. The magnitude of $a \wedge b$ is $|a||b| \sin(\theta)$, the
same as the area of the plane segment swept out by the vectors.

The outer product of two vectors has the following algebraic properties: 

(i) The product is antisymmetric:
\[ a \wedge b = -b \wedge a. \tag{1.37} \]
This has the geometric interpretation of reversing the orientation of the
surface defined by $a$ and $b$. It follows immediately that
\[ a \wedge a = 0, \quad \text{for all vectors } a. \tag{1.38} \]

(ii) Bivectors form a linear space, the same way that vectors do. In two and
three dimensions the addition of bivectors is easy to visualise. In higher
dimensions this addition is not always so easy to visualise, because two
planes need not share a common line.

(iii) The outer product is distributive over addition:
\[ a \wedge (b + c) = a \wedge b + a \wedge c. \tag{1.39} \]
This helps to visualise the addition of bivectors which share a common
line (see figure 1.5).


Figure 1.5 A geometric picture of bivector addition. In three dimensions
any two non-parallel planes share a common line. If this line is denoted $a$,
the two planes can be represented by $a \wedge b$ and $a \wedge c$. Bivector addition
proceeds much like vector addition. The planes are combined at a common
boundary and the resulting plane is defined by the initial and final edges,
as opposed to the initial and final points for vector addition. The mathematical
statement of this addition rule is the distributivity of the outer
product over addition.


While it is convenient to visualise the outer product as a parallelogram, the 
actual shape of the object is not conveyed by the result of the product. This can
be seen easily by defining $a' = a + \lambda b$ and forming
\[ a' \wedge b = a \wedge b + \lambda b \wedge b = a \wedge b. \tag{1.40} \]


The same bivector can therefore be generated by many different pairs of vectors. 
In many ways it is better to replace the picture of a directed parallelogram with 
that of a directed circle. The circle defines both the plane and a handedness, 
and its area is equal to the magnitude of the bivector. This therefore conveys 
all of the information one has about the bivector, though it does make bivector 
addition harder to visualise. 


1.6.1 Two dimensions 


The outer product of any two vectors defines a plane, so one has to go to at least 
two dimensions to form an interesting product. Suppose then that {e1, e2} are 
an orthonormal basis for the plane, and introduce the vectors 


\[ a = a_1 e_1 + a_2 e_2, \qquad b = b_1 e_1 + b_2 e_2. \tag{1.41} \]
The outer product $a \wedge b$ expands to
\[ \begin{aligned} a \wedge b &= a_1 b_1 e_1 \wedge e_1 + a_1 b_2 e_1 \wedge e_2 + a_2 b_1 e_2 \wedge e_1 + a_2 b_2 e_2 \wedge e_2 \\ &= (a_1 b_2 - a_2 b_1) e_1 \wedge e_2, \end{aligned} \tag{1.42} \]
which recovers the imaginary part of the product of (1.18). The term therefore
immediately has the expected magnitude $|a||b| \sin(\theta)$. The coefficient of $e_1 \wedge e_2$
is positive if $a$ and $b$ have the same orientation as $e_1$ and $e_2$. The orientation is
defined by traversing the boundary of the parallelogram defined by the vectors $a$,
$b$, $-a$, $-b$ (see figure 1.4). By convention, we usually work with a right-handed
set of reference axes (viewed from above). In this case the coefficient $a_1 b_2 - a_2 b_1$
will be positive if $a$ and $b$ also form a right-handed pair.
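
The two-dimensional result can be explored numerically. The sketch below is illustrative only: the helper name `wedge2`, the sample vectors and the shear parameter `lam` are all assumptions made for the example. It compares the coefficient of (1.42) with $|a||b| \sin(\theta)$ and checks the antisymmetry and shear invariance discussed above.

```python
import math


def wedge2(a, b):
    """Coefficient of e1^e2 in a^b for plane vectors a and b, as in (1.42)."""
    return a[0] * b[1] - a[1] * b[0]


a, b = (3.0, 0.0), (1.0, 2.0)
lam = 0.5                                   # arbitrary shear parameter (assumed value)
a_sheared = (a[0] + lam * b[0], a[1] + lam * b[1])

theta = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])   # angle from a to b
area = math.hypot(*a) * math.hypot(*b) * math.sin(theta)  # |a||b| sin(theta)

print(wedge2(a, b), area)        # directed area, computed two ways
print(wedge2(b, a))              # sign flips when the order is swapped
print(wedge2(a_sheared, b))      # unchanged by a -> a + lam * b, as in (1.40)
```

The last line illustrates why the parallelogram picture should not be taken too literally: quite different pairs of vectors produce the same bivector.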


1.6.2 Three dimensions 


In three dimensions the space of bivectors is also three-dimensional, because each 
bivector can be placed in a one-to-one correspondence with the vector perpen- 
dicular to it. Suppose that {e1, e2,e3} form a right-handed basis (see comments 
below), and the two vectors $a$ and $b$ are expanded in this basis as $a = a_i e_i$ and
$b = b_j e_j$. The bivector $a \wedge b$ can then be decomposed in terms of an orthonormal
frame of bivectors by
\[ \begin{aligned} a \wedge b &= (a_i e_i) \wedge (b_j e_j) \\ &= (a_2 b_3 - a_3 b_2) e_2 \wedge e_3 + (a_3 b_1 - a_1 b_3) e_3 \wedge e_1 + (a_1 b_2 - a_2 b_1) e_1 \wedge e_2. \end{aligned} \tag{1.43} \]
The components in this frame are therefore the same as those of the cross product.
But instead of being the components of a vector perpendicular to $a$ and $b$,
they are the components of the bivector $a \wedge b$. It is this distinction which enables
the outer product to be defined in any dimension.
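
As a rough illustration of this point, the following Python sketch computes the components of $a \wedge b$ on the basis bivectors $e_i \wedge e_j$ with $i < j$ in any dimension. The function names `wedge` and `cross` and the 0-based index pairs are assumptions of the example, not notation from the text.

```python
from itertools import combinations


def wedge(a, b):
    """Components of a^b on the basis bivectors e_i^e_j with i < j (0-based indices)."""
    return {(i, j): a[i] * b[j] - a[j] * b[i] for i, j in combinations(range(len(a)), 2)}


def cross(a, b):
    """Ordinary cross product, available only in three dimensions."""
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


a, b = (1, 2, 3), (4, 5, 6)
B = wedge(a, b)
print(B)             # {(0, 1): -3, (0, 2): -6, (1, 2): -3}
print(cross(a, b))   # (-3, 6, -3)

# In four dimensions there is no cross product, but the bivector still exists
# and has six independent components.
print(wedge((1, 0, 0, 0), (0, 1, 0, 0)))
```

In three dimensions the entries `B[(1, 2)]`, `-B[(0, 2)]` and `B[(0, 1)]` are the coefficients of $e_2 \wedge e_3$, $e_3 \wedge e_1$ and $e_1 \wedge e_2$, and they match the cross product components in that order.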


1.6.3 Handedness


We have started to employ the idea of handedness without giving a satisfactory 
definition of it. The only space in which there is an unambiguous definition of 
handedness is three dimensions, as this is the space we inhabit and most of us 
can distinguish our left and right hands. This concept of ‘left’ and ‘right’ is 
a man-made convention adopted to make our life easier, and it extends to the 
concept of a frame in a straightforward way. Suppose that we are presented 
with three orthogonal vectors $\{e_1, e_2, e_3\}$. We align the 3 axis with the thumb
of our right hand and then close our fist. If the direction in which our fist closes 
is the same as that formed by rotating from the 1 to the 2 axis, the frame is 
right-handed. If not, it is left-handed. 

Swapping any pair of vectors swaps the handedness of a frame. Performing two 
such swaps returns us to the original handedness. In three dimensions this corresponds
to a cyclic reordering, and ensures that the frames $\{e_1, e_2, e_3\}$, $\{e_3, e_1, e_2\}$
and $\{e_2, e_3, e_1\}$ all have the same orientation.

There is no agreed definition of a ‘right-handed’ orientation in spaces of di- 
mensions other than three. All one can do is to make sure that any convention 
used is adopted consistently. In all dimensions the orientation of a set of vec- 
tors is changed if any two vectors are swapped. In two dimensions one does 
still tend to talk about right-handed axes, though the definition is dependent 
on the idea of looking down on the plane from above. The idea of above and 
below is not a feature of the plane itself, but depends on how we embed it in our 
three-dimensional world. There is no definition of left or right-handed which is 
intrinsic to the plane. 


1.6.4 Extending the outer product 


The preceding examples demonstrate that in arbitrary dimensions the components
of $a \wedge b$ are given by
\[ (a \wedge b)_{ij} = a_{[i} b_{j]}, \tag{1.44} \]
where the [ ] denotes antisymmetrisation. Grassmann was able to take this idea
further by defining an outer product for any number of vectors. The idea is a
simple extension of the preceding formula. Expressed in an orthonormal frame,
the components of the outer product of $n$ vectors are the totally antisymmetrised
products of the components of each vector. This definition has the useful property
that the outer product is associative,
\[ a \wedge (b \wedge c) = (a \wedge b) \wedge c. \tag{1.45} \]
For example, in three dimensions we have
\[ a \wedge b \wedge c = (a_i e_i) \wedge (b_j e_j) \wedge (c_k e_k) = \epsilon_{ijk} a_i b_j c_k \, e_1 \wedge e_2 \wedge e_3, \tag{1.46} \]


which represents a directed volume (see section 2.4). 

A further feature of the antisymmetry of the product is that the outer product 
of any set of linearly dependent vectors vanishes. This means that statements like 
‘this vector lies on a given plane’, or ‘these two hypersurfaces share a common 
line’ can be encoded algebraically in a simple manner. Equipped with these 
ideas, Grassmann was able to construct a system capable of handling geometric 
concepts in arbitrary dimensions. 
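
The totally antisymmetrised components, and the vanishing of the product for linearly dependent vectors, can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions: the names `perm_sign` and `outer` are invented for the example, components are listed only on basis blades with increasing (0-based) indices, and the sum over permutations is the simplest way to write the antisymmetrisation rather than an efficient one.

```python
from itertools import combinations, permutations


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def outer(*vectors):
    """Components of v1^...^vk on the blades e_i1^...^e_ik with i1 < ... < ik.

    Each component is the totally antisymmetrised product of vector
    components, which is a k x k minor of the matrix of components.
    """
    k, n = len(vectors), len(vectors[0])
    comps = {}
    for idx in combinations(range(n), k):
        total = 0
        for p in permutations(range(k)):
            term = perm_sign(p)
            for row, col in enumerate(p):
                term *= vectors[row][idx[col]]
            total += term
        comps[idx] = total
    return comps


a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
dependent = tuple(x + 2 * y for x, y in zip(a, b))   # a + 2b lies in the a^b plane

print(outer(a, b, c))          # {(0, 1, 2): -3}, the coefficient in (1.46)
print(outer(a, b, dependent))  # {(0, 1, 2): 0}
```

The second result is the algebraic statement that the vector `dependent` lies in the plane defined by `a` and `b`.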

Despite Grassmann’s considerable achievement, the book describing his ideas, 
his Lineale Ausdehnungslehre, did not have any immediate impact. This was 
no doubt due largely to his relative lack of reputation (he was still a German 
schoolteacher when he wrote this work). It was over twenty years before anyone 
of note referred to Grassmann’s work, and during this time Grassmann produced 
a second, extended version of the Ausdehnungslehre. In the latter part of the 
nineteenth century Grassmann’s work started to influence leading figures like 
Gibbs and Clifford. Gibbs wrote a number of papers praising Grassmann’s work 
and contrasting it favourably with the quaternion algebra. Clifford used Grass- 
mann’s work as the starting point for the development of his geometric algebra, 
the subject of this book. 

Today, Grassmann’s ideas are recognised as the first presentation of the ab- 
stract theory of vector spaces over the field of real numbers. Since his death, his 
work has given rise to the influential and fashionable areas of differential forms 
and Grassmann variables. The latter are anticommuting variables and are fun- 
damental to the foundations of much of modern supersymmetry and superstring 
theory. 


1.7 Notes 


Descriptions of linear algebra and vector spaces can be found in most intro- 
ductory textbooks of mathematics, as can discussions of the scalar and cross 
products and complex arithmetic. Quaternions, on the other hand, are much less 
likely to be mentioned. There is a large specialised literature on the quaternions, 
and a good starting point is the work of Altmann (1986, 1989). Altmann’s
paper on ‘Hamilton, Rodrigues and the quaternion scandal’ (1989) is also a good
introduction to the history of the subject. 

The outer product is covered in most modern textbooks on geometry and 
physics, such as those by Nakahara (1990), Schutz (1980), and Göckeler &
Schücker (1987). In most of these works, however, the exterior product is only
treated in the context of differential forms. Applications to wider topics in geometry
have been discussed by Hestenes (1991) and others. A useful summary is
provided in the proceedings of the conference Hermann Gunther Grassmann
(1809-1877), edited by Schubring (1996). Grassmann’s Lineale Ausdehnungslehre
is also finally available in English translation due to Kannenberg (1995).

For those with a deeper interest in the history of mathematics and the develop- 
ment of vector algebra a good starting point is the set of books by Kline (1972). 
There are also biographies available of many of the key protagonists. Perhaps 
even more interesting is to return to their original papers and experience first 
hand the robust and often humorous language employed at the time. The col- 
lected works of J.W. Gibbs (1906) are particularly entertaining and enlightening, 
and contain a good deal of valuable historical information. 


1.8 Exercises 


1.1 Suppose that the two sets {a1,..., am} and {b1,..., bn} form bases for 
the same vector space, and suppose initially that m > n. By establishing 
a contradiction, prove the basis theorem that all bases of a vector space 
have the same number of elements. 
1.2 Demonstrate that the following define vector spaces: 
(a) the set of all polynomials of degree less than or equal to n; 
(b) all solutions of a given linear homogeneous ordinary differential 
equation; 
(c) the set of all n x m matrices. 


1.3 Prove that in Euclidean space $|a + b| \le |a| + |b|$. When does equality
hold?
1.4 Show that the unit quaternions $\{\pm 1, \pm i, \pm j, \pm k\}$ form a discrete group.
1.5 The unit quaternions $i, j, k$ are generators of rotations about their respective
axes. Are rotations through either $\pi$ or $\pi/2$ consistent with the
equation $ijk = -1$?


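1.6 Prove the following:
(a) $a \cdot (b \times c) = b \cdot (c \times a) = c \cdot (a \times b)$;
(b) $a \times (b \times c) = a \cdot c \, b - a \cdot b \, c$;
(c) $|a \times b| = |a||b| \sin(\theta)$, where $a \cdot b = |a||b| \cos(\theta)$.
1.7 Prove that the dimension of the space formed by the exterior product
of $m$ vectors drawn from a space of dimension $n$ is
\[ \frac{n(n-1) \cdots (n-m+1)}{1 \cdot 2 \cdots m} = \frac{n!}{(n-m)! \, m!}. \]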
1.8 Prove that the n-fold exterior product of a set of n dependent vectors is
zero.
1.9 A convex polygon in a plane is specified by the ordered set of points
$\{x_0, x_1, \ldots, x_n\}$. Prove that the directed area of the polygon is given by
\[ A = \tfrac{1}{2}(x_0 \wedge x_1 + x_1 \wedge x_2 + \cdots + x_n \wedge x_0). \]
What is the significance of the sign? Can you extend the idea to a
triangulated surface in three dimensions?
2 Geometric algebra in two and three dimensions


Geometric algebra was introduced in the nineteenth century by the English math- 
ematician William Kingdon Clifford (figure 2.1). Clifford appears to have been 
one of the small number of mathematicians at the time to be significantly in- 
fluenced by Grassmann’s work. Clifford introduced his geometric algebra by 
uniting the inner and outer products into a single geometric product. This is 
associative, like Grassmann’s product, but has the crucial extra feature of being 
invertible, like Hamilton’s quaternion algebra. Indeed, Clifford’s original moti- 
vation was to unite Grassmann’s and Hamilton’s work into a single structure. 
In the mathematical literature one often sees this subject referred to as Clifford 
algebra. We have chosen to follow the example of David Hestenes, and many 
other modern researchers, by returning to Clifford’s original choice of name — 
geometric algebra. One reason for this is that the first published definition of 
the geometric product was due to Grassmann, who introduced it in the second 
Ausdehnungslehre. It was Clifford, however, who realised the great potential of 
this product and who was responsible for advancing the subject. 

In this chapter we introduce the basics of geometric algebra in two and three 
dimensions in a way that is intended to appear natural and geometric, if some- 
what informal. A more formal, axiomatic approach is delayed until chapter 4, 
where geometric algebra is defined in arbitrary dimensions. The meaning of the 
various terms in the algebra we define will be illustrated with familiar examples 
from geometry. In so doing we will also uncover how Hamilton’s quaternions 
fit into geometric algebra, and understand where it was that Hamilton and his 
followers went wrong in their treatment of three-dimensional geometry. One of 
the most powerful applications of geometric algebra is to rotations, and these 
are considered in some detail in this chapter. It is well known that rotations in 
a plane can be efficiently handled with complex numbers. We will see how to 
extend this idea to rotations in three-dimensional space. This representation has 
many applications in classical and quantum physics. 
Figure 2.1 William Kingdon Clifford 1845-1879. Born in Exeter on 4 May
1845, his father was a justice of the peace and his mother died early in his 
life. After school he went to King’s College, London and then obtained 
a scholarship to Trinity College, Cambridge, where he followed the likes 
of Thomson and Maxwell in becoming Second Wrangler. There he also 
achieved a reputation as a daring athlete, despite his slight frame. He was 
recommended for a fellowship at Trinity College by Maxwell, and in 1871 
took the Professorship of Applied Mathematics at University College, Lon- 
don. He was made a Fellow of the Royal Society at the extremely young 
age of 29. He married Lucy in 1875, and their house became a fashion- 
able meeting place for scientists and philosophers. As well as being one of 
the foremost mathematicians of his day, he was an accomplished linguist, 
philosopher and author of children’s stories. Sadly, his insatiable appetite 
for physical and mental exercise was not matched by his physique, and in 
1878 he was instructed to stop work and leave England for the Mediter- 
ranean. He returned briefly, only for his health to deteriorate further in 
the English climate. He left for Madeira, where he died on 3 March 1879 
at the age of just 33. Further details of his life can be found in the book 
Such Silver Currents (Chisholm, 2002). Portrait by John Collier (©The 
Royal Society). 


2.1 A new product for vectors 


In chapter 1 we studied various products for vectors, including the symmetric 
scalar (or inner) product and the antisymmetric exterior (or outer) product. In 
two dimensions, we showed how to interpret the result of the complex product 
zw* (section 1.3). The scalar term is the inner product of the two vectors rep- 
resenting the points in the complex plane, and the imaginary term records their 
directed area. Furthermore, the scalar term is symmetric, and the imaginary
term is antisymmetric in the two arguments. Clifford’s powerful idea was to
generalise this product to arbitrary dimensions by replacing the imaginary term
with the outer product. The result is the geometric product and is written simply
as $ab$. The result is the sum of a scalar and a bivector, so
\[ ab = a \cdot b + a \wedge b. \tag{2.1} \]


This sum of two distinct objects — a scalar and a bivector — looks strange at 
first and goes against the rule that one should only add like objects. This is the 
feature of geometric algebra that initially causes the greatest difficulty, in much 
the same way that $i^2 = -1$ initially unsettles most school children. So how is
the sum on the right-hand side of equation (2.1) to be viewed? The answer is 
that it should be viewed in precisely the same way as the addition of a real and 
an imaginary number. The result is neither purely real nor purely imaginary 
— it is a mixture of two different objects which are combined to form a single 
complex number. Similarly, the addition of a scalar to a bivector enables us 
to keep track of the separate components of the product ab. The advantages of 
this are precisely the same as the advantages of complex arithmetic over working 
with the separate real and imaginary parts. This analogy between multivectors in 
geometric algebra and complex numbers is more than a mere pedagogical device. 
As we shall discover, geometric algebra encompasses both complex numbers and 
quaternions. Indeed, Clifford’s achievement was to generalise complex arithmetic 
to spaces of arbitrary dimensions. 

From the symmetry and antisymmetry of the terms on the right-hand side of 
equation (2.1) we see that 


\[ ba = b \cdot a + b \wedge a = a \cdot b - a \wedge b. \tag{2.2} \]
It follows that
\[ a \cdot b = \tfrac{1}{2}(ab + ba) \tag{2.3} \]
and
\[ a \wedge b = \tfrac{1}{2}(ab - ba). \tag{2.4} \]


We can thus define the inner and outer products in terms of the geometric 
product. This forms the starting point for an axiomatic development of geometric 
algebra, which is presented in chapter 4. 

If we form the product of $a$ and the parallel vector $\lambda a$ we obtain
\[ a(\lambda a) = \lambda a \cdot a + \lambda a \wedge a = \lambda a \cdot a, \tag{2.5} \]
which is therefore a pure scalar. It follows similarly that $a^2$ is a scalar, so we
can write $a^2 = |a|^2$ for the square of the length of a vector. If instead $a$ and $b$
are perpendicular vectors, their product is
\[ ab = a \cdot b + a \wedge b = a \wedge b \tag{2.6} \]
and so is a pure bivector. We also see that
\[ ba = b \cdot a + b \wedge a = -a \wedge b = -ab, \tag{2.7} \]


which shows us that orthogonal vectors anticommute. The geometric product 
between general vectors encodes the relative contributions of both their parallel 
and perpendicular components, summarising these in the separate scalar and 
bivector terms. 
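
A small numerical sketch may help to fix these ideas. It stores the product $ab$ of two vectors as a pair (scalar part, bivector components); this representation and the name `geometric_product` are assumptions made for the illustration, and the function only handles the product of two vectors, not general multivectors.

```python
from itertools import combinations


def geometric_product(a, b):
    """Return (a.b, a^b) for two vectors, with a^b stored on e_i^e_j for i < j."""
    scalar = sum(x * y for x, y in zip(a, b))
    bivector = {(i, j): a[i] * b[j] - a[j] * b[i] for i, j in combinations(range(len(a)), 2)}
    return scalar, bivector


a, b = (1, 2, 2), (2, 0, 1)
ab = geometric_product(a, b)
ba = geometric_product(b, a)
print(ab)   # (4, {(0, 1): -4, (0, 2): -3, (1, 2): 2})
print(ba)   # same scalar part, bivector part with opposite sign, as in (2.2)

# Parallel vectors give a pure scalar, perpendicular vectors a pure bivector.
print(geometric_product(a, (3, 6, 6)))
print(geometric_product((1, 0, 0), (0, 2, 0)))
```

Adding and subtracting the two results `ab` and `ba` component by component, and halving, reproduces (2.3) and (2.4).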


2.2 An outline of geometric algebra 


Clifford went further than just allowing scalars to be added to bivectors. He 
defined an algebra in which elements of any type could be added or multiplied 
together. This is what he called a geometric algebra. Elements of a geometric 
algebra are called multivectors and these form a linear space — scalars can be 
added to bivectors, and vectors, etc. Geometric algebra is a graded algebra, and 
elements of the algebra can be broken up into terms of different grade. The scalar 
objects are assigned grade-0, the vectors grade-1, the bivectors grade-2 and so 
on. Essentially, the grade of the object is the dimension of the hyperplane it 
specifies. The term ‘grade’ is preferred to ‘dimension’, however, as the latter is 
regularly employed for the size of a linear space. We denote the operation of 
projecting onto the terms of a chosen grade by $\langle \; \rangle_r$, so $\langle ab \rangle_2$ denotes the grade-2
(bivector) part of the geometric product $ab$. That is,
\[ \langle ab \rangle_2 = a \wedge b. \tag{2.8} \]
The subscript 0 on the scalar term is usually suppressed, so we also have
\[ \langle ab \rangle_0 = \langle ab \rangle = a \cdot b. \tag{2.9} \]


Arbitrary multivectors can also be multiplied together with the geometric 
product. To do this we first extend the geometric product of two vectors to an 
arbitrary number of vectors. This is achieved with the additional rule that the 
geometric product is associative: 


a(bc) = (ab)c = abc. (2.10) 


The associativity property enables us to remove the brackets and write the prod- 
uct as abc. Arbitrary multivectors can now be written as sums of products of 
vectors. The geometric product of multivectors therefore inherits the two main 
properties of the product for vectors, which is to say it is associative: 


A(BC) = (AB)C = ABC, (2.11) 
and distributive over addition:
A(B+C) = AB + AC. (2.12) 


Here A, B,...,C denote multivectors containing terms of arbitrary grade. 

The associativity property ensures that it is now possible to divide by vectors, 
thus realising Hamilton’s goal. Suppose that we know that ab = C, where C is 
some combination of a scalar and bivector. We find that 


\[ Cb = (ab)b = a(bb) = ab^2, \tag{2.13} \]
so we can define $b^{-1} = b/b^2$, and recover $a$ from
\[ a = Cb^{-1}. \tag{2.14} \]


This ability to divide by vectors gives the algebra considerable power. 
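
Division by a vector can be tried out in the plane, where $C$ has only a scalar part and a single bivector coefficient. The sketch below is an illustration under stated assumptions: the function names are invented, $C$ is stored as the pair $(c_0, c_3)$ meaning $c_0 + c_3 e_1 e_2$, and the code uses the products $(e_1 e_2)e_1 = -e_2$ and $(e_1 e_2)e_2 = e_1$, which follow from associativity and the anticommutation of orthogonal vectors.

```python
def vec_times_vec(a, b):
    """Geometric product ab of two plane vectors as (scalar, e1e2 coefficient)."""
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] - a[1] * b[0])


def even_times_vec(C, b):
    """Product Cb of C = c0 + c3 e1e2 with the vector b; the result is a vector."""
    c0, c3 = C
    # Uses (e1e2)e1 = -e2 and (e1e2)e2 = e1.
    return (c0 * b[0] + c3 * b[1], c0 * b[1] - c3 * b[0])


def inverse(b):
    """Vector inverse b^{-1} = b / b^2."""
    b2 = b[0] * b[0] + b[1] * b[1]
    return (b[0] / b2, b[1] / b2)


a, b = (3.0, -1.0), (2.0, 4.0)
C = vec_times_vec(a, b)                    # C = ab
recovered = even_times_vec(C, inverse(b))  # a = C b^{-1}, as in (2.14)
print(C)
print(recovered)
```

Up to rounding, `recovered` reproduces the original vector `a`. Neither the scalar product nor the cross product alone allows this, since each discards part of the information held in $ab$.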
As an example of these axioms in action, consider forming the square of the 
bivector $a \wedge b$. The properties of the geometric product allow us to write
\[ \begin{aligned} (a \wedge b)(a \wedge b) &= (ab - a \cdot b)(a \cdot b - ba) \\ &= -ab^2a - (a \cdot b)^2 + a \cdot b \, (ab + ba) \\ &= (a \cdot b)^2 - a^2 b^2 \\ &= -a^2 b^2 \sin^2(\theta), \end{aligned} \tag{2.15} \]
where we have assumed that $a \cdot b = |a||b| \cos(\theta)$. The magnitude of the bivector
$a \wedge b$ is therefore equal to the area of the parallelogram with sides defined by $a$
and b. Manipulations such as these are commonplace in geometric algebra, and 
can provide simplified proofs of a number of useful results. 


2.3 Geometric algebra of the plane 


The easiest way to understand the geometric product is by example, so consider 
a two-dimensional space (a plane) spanned by two orthonormal vectors $e_1$ and
$e_2$. These basis vectors satisfy
\[ e_1^2 = e_2^2 = 1, \qquad e_1 \cdot e_2 = 0. \tag{2.16} \]
The final entity present in the algebra is the bivector $e_1 \wedge e_2$. This is the highest
grade element in the algebra, since the outer product of a set of dependent vectors 
is always zero. The highest grade element in a given algebra is usually called 
the pseudoscalar, and its grade coincides with the dimension of the underlying 
vector space. 

The full algebra is spanned by the basis set 


\[ \begin{array}{ccc} 1 & \{e_1, e_2\} & e_1 \wedge e_2 \\ \text{1 scalar} & \text{2 vectors} & \text{1 bivector} \end{array} \tag{2.17} \]
We denote this algebra $\mathcal{G}_2$. Any multivector can be decomposed in this basis,
and sums and products can be calculated in terms of this basis. For example,
suppose that the multivectors $A$ and $B$ are given by
\[ \begin{aligned} A &= \alpha_0 + \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_1 \wedge e_2, \\ B &= \beta_0 + \beta_1 e_1 + \beta_2 e_2 + \beta_3 e_1 \wedge e_2, \end{aligned} \]
then their sum $S = A + B$ is given by
\[ S = (\alpha_0 + \beta_0) + (\alpha_1 + \beta_1) e_1 + (\alpha_2 + \beta_2) e_2 + (\alpha_3 + \beta_3) e_1 \wedge e_2. \tag{2.18} \]


This result for the addition of multivectors is straightforward and unsurprising. 
Matters become more interesting, however, when we start forming products. 


2.3.1 The bivector and its products 


To study the properties of the bivector $e_1 \wedge e_2$ we first recall that for orthogonal
vectors the geometric product is a pure bivector:
\[ e_1 e_2 = e_1 \cdot e_2 + e_1 \wedge e_2 = e_1 \wedge e_2, \tag{2.19} \]
and that orthogonal vectors anticommute:
\[ e_2 e_1 = e_2 \wedge e_1 = -e_1 \wedge e_2 = -e_1 e_2. \tag{2.20} \]
We can now form products in which $e_1 e_2$ multiplies vectors from the left and the
right. First from the left we find that
\[ (e_1 \wedge e_2) e_1 = (-e_2 e_1) e_1 = -e_2 e_1 e_1 = -e_2 \tag{2.21} \]
and
\[ (e_1 \wedge e_2) e_2 = (e_1 e_2) e_2 = e_1 e_2 e_2 = e_1. \tag{2.22} \]
If we assume that $e_1$ and $e_2$ form a right-handed pair, we see that left-multiplication
by the bivector rotates vectors 90° clockwise (i.e. in a negative sense).
Similarly, acting from the right
\[ e_1 (e_1 e_2) = e_2, \qquad e_2 (e_1 e_2) = -e_1. \tag{2.23} \]
So right multiplication rotates 90° anticlockwise, which is a positive sense.
The final product in the algebra to consider is the square of the bivector $e_1 \wedge e_2$:
\[ (e_1 \wedge e_2)^2 = e_1 e_2 e_1 e_2 = -e_1 e_1 e_2 e_2 = -1. \tag{2.24} \]
Geometric considerations have led naturally to a quantity which squares to $-1$.
This fits with the fact that two successive left (or right) multiplications of a vector
by $e_1 e_2$ rotate the vector through 180°, which is equivalent to multiplying by $-1$.
The fact that we now have a firm geometric picture for objects whose algebraic
square is $-1$ opens up the possibility of providing a geometric interpretation for
the unit imaginary employed throughout physics, a theme which will be explored
further in this book.


2.3.2 Multiplying multivectors 


Now that all of the individual products have been found, we can compute the 
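product of the two general multivectors $A$ and $B$ of equation (2.18),
\[ AB = M = \mu_0 + \mu_1 e_1 + \mu_2 e_2 + \mu_3 e_1 e_2, \tag{2.25} \]
where
\[ \begin{aligned} \mu_0 &= \alpha_0 \beta_0 + \alpha_1 \beta_1 + \alpha_2 \beta_2 - \alpha_3 \beta_3, \\ \mu_1 &= \alpha_0 \beta_1 + \alpha_1 \beta_0 + \alpha_3 \beta_2 - \alpha_2 \beta_3, \\ \mu_2 &= \alpha_0 \beta_2 + \alpha_2 \beta_0 + \alpha_1 \beta_3 - \alpha_3 \beta_1, \\ \mu_3 &= \alpha_0 \beta_3 + \alpha_3 \beta_0 + \alpha_1 \beta_2 - \alpha_2 \beta_1. \end{aligned} \tag{2.26} \]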


The full product shown here is actually rarely used, but writing it out explicitly 
does emphasise some of its key features. The product is always well defined, 
and the algebra is closed under it. Indeed, the product could easily be made an 
intrinsic part of a computer language, in the same way that complex arithmetic 
is already intrinsic to some languages. The basis vectors can also be represented 
with matrices, for example 


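\[ E_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad E_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.27} \]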


(Verifying that these satisfy the required algebraic relations is left as an exercise.) 
Geometric algebras in general are associative algebras, so it is always possible 
to construct a matrix representation for them. The problem with this is that 
the matrices hide the geometric content of the elements they represent. Much of 
the mathematical literature does focus on matrix representations, and for this 
work the term Clifford algebra is appropriate. For the applications in this book, 
however, the underlying geometry is the important feature of the algebra and 
matrix representations are usually redundant. Geometric algebra is a much more 
appropriate name for this subject. 
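
To make the remark about computer implementation concrete, here is a minimal Python sketch of the product (2.26). The tuple layout (scalar, $e_1$, $e_2$, $e_1 e_2$) and the names `gp`, `E1`, `E2` and `I` are assumptions made for this example, not a standard library interface.

```python
def gp(A, B):
    """Geometric product in G2 for multivectors stored as (scalar, e1, e2, e1e2)."""
    a0, a1, a2, a3 = A
    b0, b1, b2, b3 = B
    return (
        a0 * b0 + a1 * b1 + a2 * b2 - a3 * b3,   # scalar part
        a0 * b1 + a1 * b0 + a3 * b2 - a2 * b3,   # e1 part
        a0 * b2 + a2 * b0 + a1 * b3 - a3 * b1,   # e2 part
        a0 * b3 + a3 * b0 + a1 * b2 - a2 * b1,   # e1e2 part
    )


E1 = (0, 1, 0, 0)
E2 = (0, 0, 1, 0)
I = (0, 0, 0, 1)    # the bivector e1e2

print(gp(E1, E2))   # (0, 0, 0, 1): e1e2
print(gp(E2, E1))   # (0, 0, 0, -1): orthogonal vectors anticommute
print(gp(I, E1))    # (0, 0, -1, 0): left multiplication sends e1 to -e2
print(gp(E1, I))    # (0, 0, 1, 0): right multiplication sends e1 to e2
print(gp(I, I))     # (-1, 0, 0, 0): the bivector squares to -1
```

Because the result is again a 4-tuple, the closure of the algebra under the product is explicit, and nesting calls such as `gp(gp(A, B), C)` and `gp(A, gp(B, C))` gives a direct check of associativity on sample values.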


2.3.3 Connection with complex numbers 


It is clear that there is a close relationship between geometric algebra in two 
dimensions and the algebra of complex numbers. The unit bivector squares to
$-1$ and generates rotations through 90°. The combination of a scalar and a
bivector, which is formed naturally via the geometric product, can therefore be
viewed as a complex number. We write this as


\[ Z = u + v e_1 e_2 = u + Iv, \tag{2.28} \]
where
\[ I = e_1 \wedge e_2, \qquad I^2 = -1. \tag{2.29} \]


Figure 2.2 The Argand diagram. The complex number $Z = u + iv$ represents
a vector in the complex plane, with Cartesian components $u$ and $v$.
The polar decomposition into $|Z| \exp(i\theta)$ can alternatively be viewed as an
instruction to rotate 1 through $\theta$ and dilate by $|Z|$.


Throughout we employ the symbol $I$ for the pseudoscalar of the algebra of interest.
That is why we have used it here, rather than the tempting alternative
$i$. The latter is seen often in the literature, but the $i$ symbol has the problem of
suggesting an element which commutes with all others, which is not necessarily
a property of the pseudoscalar.

Complex numbers serve a dual purpose in two dimensions. They generate 
rotations and dilations through their polar decomposition $|Z| \exp(i\theta)$, and they
also represent vectors as points on the Argand diagram (see figure 2.2). But
in the geometric algebra $\mathcal{G}_2$ complex numbers are replaced by scalar + bivector
combinations, whereas vectors are grade-1 objects,


\[ x = u e_1 + v e_2. \tag{2.30} \]


Is there a natural map between $x$ and the multivector $Z$? The answer is simple:
pre-multiply by $e_1$,


\[ e_1 x = u + v e_1 e_2 = u + Iv = Z. \tag{2.31} \]


That is all there is to it! The role of the preferred vector $e_1$ is clear: it is
the real axis. Using this product, vectors in a plane can be interchanged with
complex numbers in a natural manner. 

If we now consider the complex conjugate of $Z$, $Z^\dagger = u - iv$, we see that
\[ Z^\dagger = u + v e_2 e_1 = x e_1, \tag{2.32} \]
which has simply reversed the order of the geometric product of $x$ and $e_1$. This
operation of reversing the order of products is one of the fundamental operations
performed in geometric algebra, and is called reversion (see section 2.5). Suppose
now that we introduce a second complex number $W$, with vector equivalent $y$:
\[ W = e_1 y. \tag{2.33} \]
The complex product $Z W^\dagger = W^\dagger Z$ now becomes
\[ W^\dagger Z = y e_1 e_1 x = yx, \tag{2.34} \]


which returns the geometric product yx. This is as expected, as the complex 
product was used to suggest the form of the geometric product. 
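The map between vectors and complex numbers can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a multivector is stored as a Python dictionary from a blade bitmask (1 for $e_1$, 2 for $e_2$, 3 for $e_1 e_2$) to its coefficient, and the helper names `blade_sign`, `gp` and `as_complex` are our own.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


E1 = {1: 1.0}                       # the basis vector e1 (the "real axis")


def as_complex(m):
    """Read the multivector u + I v (scalar plus e1e2 part) as u + iv."""
    return complex(m.get(0, 0.0), m.get(3, 0.0))


x = {1: 3.0, 2: 4.0}                # x = 3 e1 + 4 e2
y = {1: 1.0, 2: -2.0}               # y = e1 - 2 e2

Z = as_complex(gp(E1, x))           # Z = e1 x, equation (2.31)
Z_dagger = as_complex(gp(x, E1))    # reversed product x e1, equation (2.32)
W = as_complex(gp(E1, y))           # W = e1 y, equation (2.33)
yx = as_complex(gp(y, x))           # the geometric product y x
```

With these definitions `Z_dagger` is the ordinary complex conjugate of `Z`, and `W.conjugate() * Z` agrees with `yx`, as in equation (2.34).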


2.3.4 Rotations 


Since we know how to rotate complex numbers, we can use this to find a formula
for rotating vectors in a plane. We know that a positive rotation through an
angle $\phi$ for a complex number $Z$ is achieved by

$$Z \mapsto Z' = e^{i\phi} Z, \tag{2.35}$$

where $i$ is the standard unit imaginary (see figure 2.3). Again, we now view $Z$
as a combination of a scalar and a pseudoscalar in $\mathcal{G}_2$ and so replace $i$ with $I$.
The exponential of $I\phi$ is defined by power series in the normal way, so we still
have

$$e^{I\phi} = \sum_{n=0}^{\infty} \frac{(I\phi)^n}{n!} = \cos\phi + I \sin\phi. \tag{2.36}$$

Suppose that $Z'$ has the vector equivalent $x'$,

$$x' = e_1 Z'. \tag{2.37}$$

We now have a means of rotating the vector directly by writing

$$x' = e_1 e^{I\phi} Z = e_1 e^{I\phi} e_1 x. \tag{2.38}$$

But

$$e_1 e^{I\phi} e_1 = e_1 (\cos\phi + I \sin\phi) e_1 = \cos\phi - I \sin\phi = e^{-I\phi}, \tag{2.39}$$

where we have employed the result that $I$ anticommutes with vectors. We
therefore arrive at the formulae

$$x' = e^{-I\phi} x = x e^{I\phi}, \tag{2.40}$$


which achieve a rotation of the vector $x$ in the $I$ plane, through an angle $\phi$.
In section 2.7 we show how to extend this idea to arbitrary dimensions. The
change of sign in the exponential acting from the left and right of the vector $x$
is to be expected. We saw earlier that left-multiplication by $I$ generated
left-handed rotations, and right-multiplication generated right-handed rotations. As
the overall rotation is right-handed, the sign of $I$ must be negative when acting
from the left.

Figure 2.3 A rotation in the complex plane. The complex number $Z$ is
multiplied by the phase term $\exp(I\phi)$, the effect of which is to replace $\theta$ by
$\theta' = \theta + \phi$.
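Equation (2.40) can be tested against the familiar component form of a planar rotation. This is only a sketch under our own conventions: multivectors are dictionaries from blade bitmask (1 for $e_1$, 2 for $e_2$, 3 for $I = e_1 e_2$) to coefficient, and `blade_sign`, `gp` and `exp_I` are hypothetical helper names.

```python
import math


def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


def exp_I(phi):
    """exp(I phi) = cos(phi) + I sin(phi), equation (2.36)."""
    return {0: math.cos(phi), 3: math.sin(phi)}


phi = math.pi / 3
x = {1: 2.0, 2: 1.0}                          # x = 2 e1 + e2

from_left = gp(exp_I(-phi), x)                # exp(-I phi) x
from_right = gp(x, exp_I(phi))                # x exp(I phi)

# Conventional anticlockwise rotation of the components (2, 1) through phi.
expected = (2.0 * math.cos(phi) - 1.0 * math.sin(phi),
            2.0 * math.sin(phi) + 1.0 * math.cos(phi))
```

Both products return a pure vector whose $e_1$ and $e_2$ components match `expected`, which illustrates why the sign of the exponent flips between the left and right forms.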

This should illustrate that geometric algebra fully encompasses complex arith- 
metic, and we will see later that complex analysis is fully incorporated as well. 
The beauty of the geometric algebra formulation is that it shows immediately 
how to extend the ideas of complex analysis to higher dimensions, a problem 
which had troubled mathematicians for many years. The key to this is the 
separation of the two roles of complex numbers by treating vectors as grade-1 
objects, and the quantities acting on them (the complex numbers) as combina- 
tions of grade-0 and grade-2 objects. These two roles generalise differently in 
higher dimensions and, once one sees this, extending complex analysis becomes 
straightforward. 


2.4 The geometric algebra of space 


The geometric algebra of three-dimensional space is a remarkably powerful tool 
for solving problems in geometry and classical mechanics. It describes vectors, 
planes and volumes in a single algebra, which contains all of the familiar vec- 
tor operations. These include the vector cross product, which is revealed as a 
disguised form of bivector. The algebra also provides a very clear and com- 
pact method for encoding rotations, which is considerably more powerful than 
working with matrices. 

We have so far constructed the geometric algebra of a plane. We now add a
third vector $e_3$ to our two-dimensional set $\{e_1, e_2\}$. All three vectors are assumed
to be orthonormal, so they all anticommute. From these three basis vectors we
generate the independent bivectors

$$\{e_1 e_2, \; e_2 e_3, \; e_3 e_1\}.$$

This is the expected number of independent planes in space. There is one further
term to consider, which is the product of all three vectors:

$$(e_1 e_2) e_3 = e_1 e_2 e_3. \tag{2.41}$$

This results in a grade-3 object, called a trivector. It corresponds to sweeping
the bivector $e_1 \wedge e_2$ along the vector $e_3$, resulting in a three-dimensional volume
element (see section 2.4.3). The trivector represents the unique volume element
in three dimensions. It is the highest grade element and is unique up to scale
(or volume) and handedness (sign). This is again called the pseudoscalar for the
algebra.

In three dimensions there are no further directions to add, so the algebra is
spanned by

$$\begin{array}{cccc} 1 & \{e_i\} & \{e_i \wedge e_j\} & e_1 e_2 e_3 \\ \text{1 scalar} & \text{3 vectors} & \text{3 bivectors} & \text{1 trivector} \end{array} \tag{2.42}$$

This basis defines a graded linear space of total dimension $8 = 2^3$. We call
this algebra $\mathcal{G}_3$. Notice that the dimensions of each subspace are given by the
binomial coefficients.
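The counting of basis elements is easy to reproduce. The sketch below is illustrative only; the function name `basis_blades` is our own, and each blade is labelled by the increasing tuple of indices of the vectors multiplied together (so the plane spanned by $e_3$ and $e_1$ appears as `(1, 3)`, which differs from $e_3 e_1$ by a sign).

```python
from itertools import combinations
from math import comb


def basis_blades(n):
    """Group the basis blades of an n-dimensional geometric algebra by grade."""
    return {r: list(combinations(range(1, n + 1), r)) for r in range(n + 1)}


blades = basis_blades(3)
dims = [len(blades[r]) for r in range(4)]     # scalars, vectors, bivectors, trivectors
total = sum(dims)
```

For three dimensions `dims` reproduces the 1, 3, 3, 1 pattern of equation (2.42), and the same function shows that the grade-$r$ subspace in $n$ dimensions has dimension `comb(n, r)`.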


2.4.1 Products of vectors and bivectors 


Our expanded algebra gives us a number of new products to consider. We start 
by considering the product of a vector and a bivector. We have already looked 
at this in two dimensions, and found that a normalised bivector rotates vectors 
in its plane by 90°. Each of the basis bivectors in equation (2.42) shares the 
properties of the single bivector studied previously for two dimensions. So 


$$(e_1 e_2)^2 = (e_2 e_3)^2 = (e_3 e_1)^2 = -1 \tag{2.43}$$


and each bivector generates 90° rotations in its own plane. 

The geometric product for vectors extends to all objects in the algebra, so we
can form expressions such as $aB$, where $a$ is a vector and $B$ is a bivector. Now
that our algebra contains a trivector $e_1(e_2 \wedge e_3)$, we see that the result of the
product $aB$ can contain both vector and trivector terms, the latter arising if $a$
does not lie fully in the $B$ plane. To understand the properties of the product
$aB$ we first decompose $a$ into terms in and out of the plane,

$$a = a_\parallel + a_\perp, \tag{2.44}$$

as shown in figure 2.4. We can now write $aB = (a_\parallel + a_\perp)B$. Suppose that we
also write

$$B = a_\parallel \wedge b = a_\parallel b, \tag{2.45}$$

where $b$ is orthogonal to $a_\parallel$ in the $B$ plane. It is always possible to find such a
vector $b$. We now see that

$$a_\parallel B = a_\parallel (a_\parallel b) = a_\parallel^2 b \tag{2.46}$$

and so is a vector. This is clear in that the product of a plane with a vector in
the plane must remain in the plane. On the other hand

$$a_\perp B = a_\perp (a_\parallel \wedge b) = a_\perp a_\parallel b, \tag{2.47}$$

which is the product of three orthogonal (anticommuting) vectors and so is a
trivector. As expected, the product of a vector and a bivector will in general
contain vector and trivector terms.

Figure 2.4 A vector and a bivector. The vector $a$ can be written as the
sum of a term in the plane $B$ and a term perpendicular to the plane, so
that $a = a_\parallel + a_\perp$. The bivector $B$ can be written as $a_\parallel \wedge b$, where $b$ is
perpendicular to $a_\parallel$.

To explore this further let us form the product of the vector $a$ with the bivector
$b \wedge c$. From the associative and distributive properties of the geometric product
we have

$$a(b \wedge c) = a \tfrac{1}{2}(bc - cb) = \tfrac{1}{2}(abc - acb). \tag{2.48}$$

We now use the rearrangement

$$ab = 2 a \cdot b - ba \tag{2.49}$$

to write

$$\begin{aligned} a(b \wedge c) &= (a \cdot b)c - (a \cdot c)b - \tfrac{1}{2}(bac - cab) \\ &= 2(a \cdot b)c - 2(a \cdot c)b + \tfrac{1}{2}(bc - cb)a, \end{aligned} \tag{2.50}$$

so that

$$a(b \wedge c) - (b \wedge c)a = 2(a \cdot b)c - 2(a \cdot c)b. \tag{2.51}$$

The right-hand side of this equation is a vector, so the antisymmetrised product
of a vector with a bivector is another vector. Since this operation is
grade-lowering, we give it the dot symbol again and write

$$a \cdot B = \tfrac{1}{2}(aB - Ba), \tag{2.52}$$

where $B$ is an arbitrary bivector. The preceding rearrangement means that we
have proved one of the most useful results in geometric algebra,

$$a \cdot (b \wedge c) = a \cdot b \, c - a \cdot c \, b. \tag{2.53}$$

Returning to equation (2.46) we see that we must have

$$a \cdot B = a_\parallel B = a_\parallel \cdot B. \tag{2.54}$$

So the effect of taking the inner product of a vector with a bivector is to project
onto the component of the vector in the plane, and then rotate this through 90°
and dilate by the magnitude of $B$. We can also confirm that

$$a \cdot B = a_\parallel^2 b = -(a_\parallel b) a_\parallel = -B \cdot a, \tag{2.55}$$

as expected.
The remaining part of the product of a vector and a bivector returns a grade-3
trivector. This product is denoted with a wedge since it is grade-raising, so

$$a \wedge (b \wedge c) = \tfrac{1}{2}\bigl(a(b \wedge c) + (b \wedge c)a\bigr). \tag{2.56}$$

A few lines of algebra confirm that this outer product is associative,

$$\begin{aligned} a \wedge (b \wedge c) &= \tfrac{1}{2}\bigl(a(b \wedge c) + (b \wedge c)a\bigr) \\ &= \tfrac{1}{4}(abc - acb + bca - cba) \\ &= \tfrac{1}{4}\bigl(2(a \wedge b)c + bac + bca + 2c(a \wedge b) - cab - acb\bigr) \\ &= \tfrac{1}{2}\bigl((a \wedge b)c + c(a \wedge b) + b(c \cdot a) - (c \cdot a)b\bigr) \\ &= (a \wedge b) \wedge c, \end{aligned} \tag{2.57}$$

so we can unambiguously write the result as $a \wedge b \wedge c$. The product $a \wedge b \wedge c$
is therefore associative and antisymmetric on all pairs of vectors, and so is
precisely Grassmann's exterior product (see section 1.6). This demonstrates that
Grassmann's exterior product sits naturally within geometric algebra. From
equation (2.47) we have

$$a \wedge B = a_\perp B = a_\perp \wedge B, \tag{2.58}$$

so the effect of the exterior product with a bivector is to project onto the
component of the vector perpendicular to the plane, and return a volume element (a
trivector). We can confirm simply that this product is symmetric in its vector
and bivector arguments:

$$a \wedge B = a_\perp \wedge a_\parallel \wedge b = -a_\parallel \wedge a_\perp \wedge b = a_\parallel \wedge b \wedge a_\perp = B \wedge a. \tag{2.59}$$

The full product of a vector and a bivector can now be written as

$$aB = a \cdot B + a \wedge B, \tag{2.60}$$

where the dot is generalised to mean the lowest grade part of the product, while
the wedge means the highest grade part of the product. In a similar manner to
the geometric product of vectors, the separate dot and wedge products can be
written in terms of the geometric product as

$$\begin{aligned} a \cdot B &= \tfrac{1}{2}(aB - Ba), \\ a \wedge B &= \tfrac{1}{2}(aB + Ba). \end{aligned} \tag{2.61}$$

But pay close attention to the signs in these formulae, which are the opposite
way round to the case of two vectors. The full product of a vector and a bivector
wraps up the separate vector and trivector terms in the single product $aB$. The
advantage of this is again that the full product is invertible.
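The decomposition of $aB$ and the expansion (2.53) lend themselves to a numerical check. The following is a sketch under our own assumptions: multivectors in $\mathcal{G}_3$ are dictionaries from blade bitmask (1, 2, 4 for $e_1$, $e_2$, $e_3$; 3, 5, 6 for $e_1 e_2$, $e_1 e_3$, $e_2 e_3$; 7 for $e_1 e_2 e_3$) to coefficient, and all helper names are hypothetical.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


def combine(x, y, s):
    """Return the multivector x + s * y."""
    out = dict(x)
    for k, c in y.items():
        out[k] = out.get(k, 0.0) + s * c
    return out


def scale(x, s):
    return {k: s * c for k, c in x.items()}


def grade(x, r):
    """Project onto the grade-r part."""
    return {k: c for k, c in x.items() if bin(k).count("1") == r}


def close(x, y, tol=1e-12):
    return all(abs(x.get(k, 0.0) - y.get(k, 0.0)) < tol for k in set(x) | set(y))


def dot(u, v):
    """Inner product of two vectors: the scalar part of uv."""
    return gp(u, v).get(0, 0.0)


a = {1: 1.0, 2: 2.0, 4: 3.0}
b = {1: -1.0, 2: 0.0, 4: 2.0}
c = {1: 4.0, 2: 1.0, 4: -2.0}

B = grade(gp(b, c), 2)                                  # B = b ^ c
aB, Ba = gp(a, B), gp(B, a)
a_dot_B = scale(combine(aB, Ba, -1.0), 0.5)             # equation (2.52)
a_wedge_B = scale(combine(aB, Ba, +1.0), 0.5)           # equation (2.56)
expansion = combine(scale(c, dot(a, b)), scale(b, dot(a, c)), -1.0)   # (2.53)
```

Here `a_dot_B` is a pure vector equal to `expansion`, `a_wedge_B` is a pure trivector, and their sum reproduces the full product `aB` of equation (2.60).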


2.4.2 The bivector algebra 


Our three independent bivectors also give us another new product to consider. 
We already know that squaring a bivector results in a scalar. But if we multiply 
together two bivectors representing orthogonal planes we find that, for example, 


$$(e_1 \wedge e_2)(e_2 \wedge e_3) = e_1 e_2 e_2 e_3 = e_1 e_3, \tag{2.62}$$

resulting in a third bivector. We also find that

$$(e_2 \wedge e_3)(e_1 \wedge e_2) = e_3 e_2 e_2 e_1 = e_3 e_1 = -e_1 e_3, \tag{2.63}$$


so the product of orthogonal bivectors is antisymmetric. The symmetric contri- 
bution vanishes because the two planes are perpendicular. 
If we introduce the following labelling for the basis bivectors: 


$$B_1 = e_2 e_3, \qquad B_2 = e_3 e_1, \qquad B_3 = e_1 e_2, \tag{2.64}$$

we find that their product satisfies

$$B_i B_j = -\delta_{ij} - \epsilon_{ijk} B_k. \tag{2.65}$$

There is a clear analogy with the geometric product of vectors here, in that the
symmetric part is a scalar, whereas the antisymmetric part is a bivector. In 
higher dimensions it turns out that the symmetrised product of two bivectors 
can have grade-0 and grade-4 terms (which we will ultimately denote with the 
dot and wedge symbols). The antisymmetrised product is always a bivector, and 
bivectors form a closed algebra under this product. 

The basis bivectors satisfy 


$$B_1^2 = B_2^2 = B_3^2 = -1 \tag{2.66}$$

and

$$B_1 B_2 = -B_2 B_1, \quad \text{etc.} \tag{2.67}$$


These are the properties of the generators of the quaternion algebra (see sec- 
tion 1.4). This observation helps to sort out some of the problems encountered 
with the quaternions. Hamilton attempted to identify pure quaternions (null 
scalar part) with vectors, but we now see that they are actually bivectors. This 
causes problems when looking at how objects transform under reflections.
Hamilton also imposed the condition $ijk = -1$ on his unit quaternions, whereas we
have

$$B_1 B_2 B_3 = e_2 e_3 e_3 e_1 e_1 e_2 = +1. \tag{2.68}$$

To set up an isomorphism we must flip a sign somewhere, for example in the $y$
component:

$$i \leftrightarrow B_1, \qquad j \leftrightarrow -B_2, \qquad k \leftrightarrow B_3. \tag{2.69}$$


This shows us that the quaternions are a left-handed set of bivectors, whereas 
Hamilton and others attempted to view the i, j, k as a right-handed set of vectors. 
Not surprisingly, this was a potential source of great confusion and meant one 
had to be extremely careful when applying quaternions in vector algebra. 
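The identification (2.69) can be checked by direct multiplication. In this sketch (our own construction, not the book's) a multivector is a dictionary from blade bitmask to coefficient, with 3, 5 and 6 standing for $e_1 e_2$, $e_1 e_3$ and $e_2 e_3$; note that $B_2 = e_3 e_1 = -e_1 e_3$ carries a minus sign in this storage convention.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


B1 = {6: 1.0}                 # e2 e3
B2 = {5: -1.0}                # e3 e1 = -e1 e3
B3 = {3: 1.0}                 # e1 e2

# Quaternion generators according to equation (2.69): j picks up the sign flip.
i, j, k = B1, {5: 1.0}, B3

B1B2B3 = gp(gp(B1, B2), B3)   # equation (2.68)
ijk = gp(gp(i, j), k)         # Hamilton's condition
```

The bivector triple product `B1B2B3` is the scalar $+1$, while `ijk` is $-1$, and the products `gp(i, j)`, `gp(j, k)` and `gp(k, i)` return `k`, `i` and `j` respectively.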


2.4.3 The trivector


Given three vectors, $a$, $b$ and $c$, the trivector $a \wedge b \wedge c$ is formed by sweeping $a \wedge b$
along the vector $c$ (see figure 2.5). The result can be represented pictorially as
an oriented parallelepiped. As with bivectors, however, the picture should not
be interpreted too literally. The trivector $a \wedge b \wedge c$ does not contain any shape
information. It just records a volume and an orientation.

The various algebraic properties of trivectors have straightforward geometric
interpretations. The same oriented volume is obtained by sweeping $a \wedge b$ along $c$
or $b \wedge c$ along $a$. The mathematical expression of this is that the outer product
is associative, $a \wedge (b \wedge c) = (a \wedge b) \wedge c$. The trivector $a \wedge b \wedge c$ changes sign
under interchange of any pair of vectors, which follows immediately from the
antisymmetry of the exterior product. The geometric picture of this is that
swapping any two vectors reverses the orientation by which the volume is swept
out. Under two successive interchanges of pairs of vectors the trivector returns
to itself, so

$$a \wedge b \wedge c = c \wedge a \wedge b = b \wedge c \wedge a. \tag{2.70}$$

This is also illustrated in figure 2.5.

Figure 2.5 The trivector. The trivector $a \wedge b \wedge c$ can be viewed as the
oriented parallelepiped obtained from sweeping the bivector $a \wedge b$ along the
vector $c$. In the left-hand diagram the bivector $a \wedge b$ is swept along $c$. In
the right-hand one $b \wedge c$ is swept along $a$. The result is the same in both
cases, demonstrating the equality $a \wedge b \wedge c = b \wedge c \wedge a$. The associativity of
the outer product is also clear from such diagrams.
The unit right-handed pseudoscalar for space is given the standard symbol $I$,
so

$$I = e_1 e_2 e_3, \tag{2.71}$$

where the $\{e_1, e_2, e_3\}$ are any right-handed frame of orthonormal vectors. If a
left-handed set of orthonormal vectors is multiplied together the result is $-I$.
Given an arbitrary set of three vectors we must have

$$a \wedge b \wedge c = \alpha I, \tag{2.72}$$

where $\alpha$ is a scalar. It is not hard to show that $|\alpha|$ is the volume of the
parallelepiped with sides defined by $a$, $b$ and $c$. The sign of $\alpha$ encodes whether the
set $\{a, b, c\}$ forms a right-handed or left-handed frame. In three dimensions this
fully accounts for the information in the trivector.
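The scalar $\alpha$ in equation (2.72) can be compared with the familiar $3 \times 3$ determinant of the components. This sketch relies on the standard fact that the signed volume of the parallelepiped equals that determinant, and on the grade-3 part of the product $abc$ being $a \wedge b \wedge c$; the storage convention (dictionary from blade bitmask to coefficient, with 7 for $e_1 e_2 e_3$) and the function names are our own.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


def vec(v):
    """Turn a component triple into the vector v1 e1 + v2 e2 + v3 e3."""
    return {1: float(v[0]), 2: float(v[1]), 4: float(v[2])}


def trivector_coefficient(a, b, c):
    """alpha in a ^ b ^ c = alpha I, read off from the grade-3 part of abc."""
    return gp(gp(vec(a), vec(b)), vec(c)).get(7, 0.0)


def det3(a, b, c):
    """Determinant of the matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


a, b, c = (1, 2, 3), (-1, 0, 2), (4, 1, -2)
alpha = trivector_coefficient(a, b, c)
alpha_swapped = trivector_coefficient(b, a, c)     # one interchange flips the sign
alpha_cyclic = trivector_coefficient(c, a, b)      # two interchanges restore it
```

The value of `alpha` agrees with `det3(a, b, c)`, changes sign under a single interchange, and is unchanged by a cyclic permutation, as in equation (2.70).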

Now consider the product of the vector $e_1$ and the pseudoscalar,

$$e_1 I = e_1(e_1 e_2 e_3) = e_2 e_3. \tag{2.73}$$

This returns a bivector, the plane perpendicular to the original vector (see
figure 2.6). The product of a grade-1 vector with the grade-3 pseudoscalar is
therefore a grade-2 bivector. Multiplying from the left we find that

$$I e_1 = e_1 e_2 e_3 e_1 = -e_1 e_2 e_1 e_3 = e_2 e_3. \tag{2.74}$$

Figure 2.6 A vector and a trivector. The result of multiplying the vector
$e_1$ by the trivector $I$ is the plane $e_1(e_1 e_2 e_3) = e_2 e_3$. This is the plane
perpendicular to the $e_1$ vector.

The result is therefore independent of order, and this holds for any basis vector.
It follows that the pseudoscalar commutes with all vectors in three dimensions:

$$Ia = aI. \tag{2.75}$$

This is always the case for the pseudoscalar in spaces of odd dimension. In even
dimensions, the pseudoscalar anticommutes with all vectors, as we have already
seen in two dimensions.

We can now express each of our basis bivectors as the product of the pseudoscalar 
and a dual vector: 


$$e_1 e_2 = I e_3, \qquad e_2 e_3 = I e_1, \qquad e_3 e_1 = I e_2. \tag{2.76}$$


This operation of multiplying by the pseudoscalar is called a duality transforma- 
tion and was originally introduced by Grassmann. Again, we can write 


$$aI = a \cdot I \tag{2.77}$$


with the dot used to denote the lowest grade term in the product. The result 
of this can be understood as a projection — projecting onto the component of I 
perpendicular to a. 

We next form the square of the pseudoscalar: 


$$I^2 = e_1 e_2 e_3 e_1 e_2 e_3 = e_1 e_2 e_1 e_2 = -1. \tag{2.78}$$

So the pseudoscalar commutes with all elements and squares to $-1$. It is therefore
a further candidate for a unit imaginary. In some physical applications this is the 
correct one to use, whereas for others it is one of the bivectors. The properties of 
I in three dimensions make it particularly tempting to replace it with the symbol 
i, and this is common practice in much of the literature. This convention can 
still lead to confusion, however, and is not adopted in this book. 
Finally, we consider the product of a bivector and the pseudoscalar:

$$I(e_1 \wedge e_2) = I e_1 e_2 e_3 e_3 = I I e_3 = -e_3. \tag{2.79}$$

So the result of the product of $I$ with the bivector formed from $e_1$ and $e_2$ is
$-e_3$, that is, minus the vector perpendicular to the $e_1 \wedge e_2$ plane. This provides
a definition of the vector cross product as

$$a \times b = -I(a \wedge b). \tag{2.80}$$

The vector cross product is largely redundant now that we have the exterior
product and duality at our disposal. For example, consider the result for the
double cross product. We form

$$\begin{aligned} a \times (b \times c) &= -I \, a \wedge \bigl(-I(b \wedge c)\bigr) \\ &= \tfrac{1}{2} I \bigl(a I (b \wedge c) - (b \wedge c) I a\bigr) \\ &= -a \cdot (b \wedge c). \end{aligned} \tag{2.81}$$


We have already calculated the expansion of the final line, which turns out to 
be the first example of a much more general, and very useful, formula. 

Equation (2.80) shows how the cross product of two vectors is a disguised 
bivector, the bivector being mapped to a vector by a duality operation. It is 
now clear why the product only exists in three dimensions — this is the only 
space for which the dual of a bivector is a vector. We will have little further 
use for the cross product and will rarely employ it from now on. This means we 
can also do away with the awkward distinction between polar and axial vectors. 
Instead we just talk in terms of vectors and bivectors. Both may belong to 
three-dimensional linear spaces, but they are quite different objects with distinct 
algebraic properties. 
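The duality definition (2.80) and the double cross product (2.81) can be compared with the usual component formulae. The code below is a sketch under our own assumptions: multivectors are dictionaries from blade bitmask to coefficient (7 denotes $I = e_1 e_2 e_3$), and `cross_ga` and `cross_classic` are hypothetical names.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


I3 = {7: 1.0}                                   # the pseudoscalar e1 e2 e3


def vec(v):
    return {1: float(v[0]), 2: float(v[1]), 4: float(v[2])}


def comps(m):
    """Components of the grade-1 part on e1, e2, e3."""
    return tuple(m.get(k, 0.0) for k in (1, 2, 4))


def wedge(a, b):
    """a ^ b for two vectors: the grade-2 part of the geometric product ab."""
    return {k: c for k, c in gp(a, b).items() if bin(k).count("1") == 2}


def cross_ga(a, b):
    """Cross product via duality, a x b = -I (a ^ b), equation (2.80)."""
    return {k: -c for k, c in gp(I3, wedge(a, b)).items()}


def cross_classic(a, b):
    """Textbook component formula, used here only for comparison."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a, b, c = (1.0, 2.0, 3.0), (-1.0, 0.0, 2.0), (4.0, 1.0, -2.0)
ab_dual = comps(cross_ga(vec(a), vec(b)))
double = comps(cross_ga(vec(a), cross_ga(vec(b), vec(c))))
double_classic = cross_classic(a, cross_classic(b, c))
```

The two routes agree component by component, both for the single product and for $a \times (b \times c)$.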


2.4.4 The Pauli algebra 
The full geometric product for vectors can be written

$$e_i e_j = e_i \cdot e_j + e_i \wedge e_j = \delta_{ij} + I \epsilon_{ijk} e_k. \tag{2.82}$$

This may be familiar to many: it is the Pauli algebra of quantum mechanics!
The Pauli matrices therefore form a matrix representation of the geometric
algebra of space. The Pauli matrices are

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{2.83}$$

These matrices satisfy

$$\sigma_i \sigma_j = \delta_{ij} \mathsf{I} + i \epsilon_{ijk} \sigma_k, \tag{2.84}$$

where $\mathsf{I}$ is the $2 \times 2$ identity matrix. Historically, these matrices were discovered
by Pauli in his investigations of the quantum theory of spin. The link with
geometric algebra ('Clifford algebra' in the quantum theory textbooks) was only
made later.

Surprisingly, though the link with the geometric algebra of space is now well 
established, one seldom sees the Pauli matrices referred to as a representation 
for the algebra of a set of vectors. Instead they are almost universally referred 
to as the components of a single vector in ‘isospace’. A handful of authors (most 
notably David Hestenes) have pointed out the curious nature of this interpreta- 
tion. Such discussion remains controversial, however, and will only be touched 
on in this book. As with all arguments over interpretations of quantum mechan- 
ics, how one views the Pauli matrices has little effect on the predictions of the 
theory. 

The fact that the Pauli matrices form a matrix representation of $\mathcal{G}_3$ provides an
alternative way of performing multivector manipulations. This method is usually 
slower, but can sometimes be used to advantage, particularly in programming 
languages where complex arithmetic is built in. Working directly with matrices 
does obscure geometric meaning, and is usually best avoided. 
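As an illustration of the matrix route, the relation (2.84) can be verified with Python's built-in complex numbers. This is a sketch of our own: matrices are stored as nested tuples, and `matmul`, `epsilon` and `pauli_rhs` are hypothetical helper names with indices running over 0, 1, 2 rather than 1, 2, 3.

```python
SIGMA = (
    ((0, 1), (1, 0)),          # sigma_1
    ((0, -1j), (1j, 0)),       # sigma_2
    ((1, 0), (0, -1)),         # sigma_3
)
IDENTITY = ((1, 0), (0, 1))


def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def epsilon(i, j, k):
    """Levi-Civita symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def pauli_rhs(i, j):
    """Right-hand side of (2.84): delta_ij times identity plus i epsilon_ijk sigma_k."""
    return tuple(
        tuple(
            (1 if i == j else 0) * IDENTITY[r][c]
            + 1j * sum(epsilon(i, j, k) * SIGMA[k][r][c] for k in range(3))
            for c in range(2)
        )
        for r in range(2)
    )


relation_holds = all(
    matmul(SIGMA[i], SIGMA[j]) == pauli_rhs(i, j)
    for i in range(3)
    for j in range(3)
)

# The matrix counterpart of the pseudoscalar I = e1 e2 e3.
pseudoscalar = matmul(matmul(SIGMA[0], SIGMA[1]), SIGMA[2])
```

The product of all three matrices is $i$ times the identity, which is the matrix image of the pseudoscalar: it commutes with everything and squares to $-1$.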


2.5 Conventions 


A number of conventions help to simplify expressions in geometric algebra. For
example, expressions such as $(a \cdot b)c$ and $I(a \wedge b)$ demonstrate that it would be
useful to have a convention which allows us to remove the brackets. We thus
introduce the operator ordering convention that in the absence of brackets, inner
and outer products are performed before geometric products. This can remove
significant numbers of unnecessary brackets. For example, we can safely write

$$I(a \wedge b) = I \, a \wedge b \tag{2.85}$$

and

$$(a \cdot b)c = a \cdot b \, c. \tag{2.86}$$

In addition, unless brackets specify otherwise, inner products are performed
before outer products,

$$a \cdot b \, c \wedge d = (a \cdot b) \, c \wedge d. \tag{2.87}$$

A simple notation for the result of projecting out the elements of a multivector
that have a given grade is also invaluable. We denote this with angled brackets
$\langle \; \rangle_r$, where $r$ is the grade onto which we want to project. With this notation we
can write, for example,

$$a \wedge b = \langle a \wedge b \rangle_2 = \langle ab \rangle_2. \tag{2.88}$$

The final expression holds because $a \wedge b$ is the sole grade-2 component of the
geometric product $ab$. This notation can be extremely useful as it often enables
inner and outer products to be replaced by geometric products, which are usually
simpler to manipulate. The operation of taking the scalar part of a product is
often needed, and it is conventional for this to drop the subscript zero and simply
write

$$\langle M \rangle = \langle M \rangle_0. \tag{2.89}$$

The scalar part of the product of any pair of multivectors is symmetric:

$$\langle AB \rangle = \langle BA \rangle. \tag{2.90}$$

It follows that the scalar part satisfies the cyclic reordering property

$$\langle AB \cdots C \rangle = \langle B \cdots CA \rangle, \tag{2.91}$$

which is frequently employed in manipulations.

An important operation in geometric algebra is that of reversion, which
reverses the order of vectors in any product. There are two conventions for this in
common usage. One is the dagger symbol, $A^\dagger$, used for Hermitian conjugation
in matrix algebra. The other is to use a tilde, $\tilde{A}$. In three-dimensional
applications the dagger symbol is often employed, as the reverse operation returns the
same result as Hermitian conjugation of the Pauli matrix representation of the
algebra. In spacetime physics, however, the tilde symbol is the better choice as
the dagger is reserved for a different (frame-dependent) operation in relativistic
quantum mechanics. For the remainder of this chapter we will use the dagger
symbol, as we will concentrate on applications in three dimensions.

Scalars and vectors are invariant under reversion, but bivectors change sign:

$$(e_1 e_2)^\dagger = e_2 e_1 = -e_1 e_2. \tag{2.92}$$

Similarly, we see that

$$I^\dagger = e_3 e_2 e_1 = e_1 e_3 e_2 = -e_1 e_2 e_3 = -I. \tag{2.93}$$

A general multivector in $\mathcal{G}_3$ can be written

$$M = \alpha + a + B + \beta I, \tag{2.94}$$

where $a$ is a vector, $B$ is a bivector and $\alpha$ and $\beta$ are scalars. From the above we
see that the reverse of $M$, $M^\dagger$, is

$$M^\dagger = \alpha + a - B - \beta I. \tag{2.95}$$

As stated above, this operation has the same effect as Hermitian conjugation
applied to the Pauli matrices.
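Reversion is simple to implement once multivectors are split by grade. The sketch below uses the standard result that a grade-$r$ blade picks up the sign $(-1)^{r(r-1)/2}$ under reversal, which reproduces (2.92), (2.93) and (2.95) for $r \le 3$. The dictionary representation (blade bitmask to coefficient) and the function names are our own assumptions.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


def reverse(m):
    """Reverse a multivector: grade-r parts pick up the sign (-1)^(r(r-1)/2)."""
    out = {}
    for k, c in m.items():
        r = bin(k).count("1")
        out[k] = -c if (r * (r - 1) // 2) % 2 else c
    return out


def close(x, y, tol=1e-12):
    return all(abs(x.get(k, 0.0) - y.get(k, 0.0)) < tol for k in set(x) | set(y))


# M = alpha + a + B + beta I with arbitrary illustrative coefficients.
M = {0: 2.0, 1: 1.0, 2: -3.0, 4: 0.5, 3: 4.0, 6: -1.0, 7: 5.0}
M_rev = reverse(M)

A = {0: 1.0, 1: 2.0, 6: -1.0, 7: 3.0}
B = {2: 4.0, 3: 1.0, 5: -2.0, 7: 1.0}
reverse_of_product = reverse(gp(A, B))
product_of_reverses = gp(reverse(B), reverse(A))
```

The last two quantities agree, which is the defining property of reversion, and the scalar parts of `gp(A, B)` and `gp(B, A)` coincide as in equation (2.90).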

We have now introduced a number of terms, some of which have overlapping 
meaning. It is useful at this point to refer to multivectors which only contain 
terms of a single grade as homogeneous. The term inner product is reserved for 
the lowest grade part of the geometric product of two homogeneous multivectors.
For two homogeneous multivectors of the same grade the inner product and scalar 
product reduce to the same thing. The terms exterior and outer products are 
interchangeable, though we will tend to prefer the latter for its symmetry with 
the inner product. The inner and outer products are also referred to colloquially 
as the dot and wedge products. We have followed convention in referring to 
the highest grade element in a geometric algebra as the pseudoscalar. This is 
a convenient name, though one must be wary that in tensor analysis the term 
can mean something subtly different. Both directed volume element and volume 
form are good alternative names, but we will stick with pseudoscalar in this 
book. 


2.6 Reflections 


The full power of geometric algebra begins to emerge when we consider reflections
and rotations. We start with an arbitrary vector $a$ and a unit vector $n$ ($n^2 = 1$),
and resolve $a$ into parts parallel and perpendicular to $n$. This is achieved simply
by forming

$$\begin{aligned} a &= n^2 a \\ &= n(n \cdot a + n \wedge a) \\ &= a_\parallel + a_\perp, \end{aligned} \tag{2.96}$$

where

$$a_\parallel = a \cdot n \, n, \qquad a_\perp = n \, n \wedge a. \tag{2.97}$$

The formula for $a_\parallel$ is certainly the projection of $a$ onto $n$, and the remaining
term must be the perpendicular component (sometimes called the rejection). We
can check that $a_\perp$ is perpendicular to $n$ quite simply:

$$n \cdot a_\perp = \langle n \, n \, n \wedge a \rangle = \langle n \wedge a \rangle = 0. \tag{2.98}$$

This is a simple example of how using the projection onto grade operator to
replace inner and outer products with geometric products can simplify derivations.

The result of reflecting $a$ in the plane orthogonal to $n$ is the vector $a' = a_\perp - a_\parallel$
(see figure 2.7). This can be written

$$\begin{aligned} a' = a_\perp - a_\parallel &= n \, n \wedge a - a \cdot n \, n \\ &= -n \cdot a \, n - n \wedge a \, n \\ &= -nan. \end{aligned} \tag{2.99}$$


This formula is already more compact than can be written down without the
geometric product. The best one can do with just the inner product is the
equivalent expression

$$a' = a - 2 a \cdot n \, n. \tag{2.100}$$

The compression afforded by the geometric product becomes increasingly
impressive as reflections are compounded together. The formula

$$a' = -nan \tag{2.101}$$

is valid in spaces of any dimension; it is a quite general formula for a reflection.

Figure 2.7 A reflection. The vector $a$ is reflected in the (hyper)plane
perpendicular to $n$. This is the way to describe reflections in arbitrary
dimensions. The result $a'$ is formed by reversing the sign of $a_\parallel$, the component
of $a$ in the $n$ direction.

We should check that our formula for the reflection has the desired property 
of leaving lengths and angles unchanged. To do this we need only verify that 
the scalar product between vectors is unchanged if both are reflected, which is 
achieved with a simple rearrangement: 


$$(-nan) \cdot (-nbn) = \langle (-nan)(-nbn) \rangle = \langle nabn \rangle = \langle abnn \rangle = a \cdot b. \tag{2.102}$$


In this manipulation we have made use of the cyclic reordering property of the 
scalar part of a geometric product, as defined in equation (2.91). 
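The reflection formula can be compared with the inner-product form (2.100), and the invariance of the scalar product checked directly. The following is a sketch with our own conventions: vectors in three dimensions are dictionaries keyed by the bitmasks 1, 2, 4 for $e_1$, $e_2$, $e_3$, and the helper names are hypothetical.

```python
def blade_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(x, y):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + blade_sign(a, b) * ca * cb
    return out


def vec(v):
    return {1: float(v[0]), 2: float(v[1]), 4: float(v[2])}


def comps(m):
    """Components of the grade-1 part on e1, e2, e3."""
    return tuple(m.get(k, 0.0) for k in (1, 2, 4))


def dot(u, v):
    """Inner product of two vectors: the scalar part of uv."""
    return gp(u, v).get(0, 0.0)


def reflect(a, n):
    """Reflect a in the plane perpendicular to the unit vector n: a' = -n a n."""
    return {k: -c for k, c in gp(gp(n, a), n).items()}


n = vec((1 / 3, 2 / 3, 2 / 3))                 # a unit vector, n^2 = 1
a = vec((1.0, -2.0, 0.5))
b = vec((3.0, 1.0, 2.0))

a_ref, b_ref = reflect(a, n), reflect(b, n)

# Equation (2.100): a' = a - 2 (a.n) n, written out in components.
a_classic = tuple(ai - 2.0 * dot(a, n) * ni for ai, ni in zip(comps(a), comps(n)))
```

The sandwich product returns a pure vector (its trivector part vanishes to rounding error), agrees with `a_classic`, and leaves `dot(a, b)` unchanged when both vectors are reflected.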


2.6.1 Complex conjugation 


In two dimensions we saw that the vector $x$ is mapped to a complex number $Z$
by

$$Z = e_1 x, \qquad x = e_1 Z. \tag{2.103}$$

The complex conjugate $Z^\dagger$ is the reverse of this, $Z^\dagger = x e_1$, so maps to the vector

$$x' = e_1 Z^\dagger = e_1 x e_1. \tag{2.104}$$

This can be converted into the formula for a reflection if we remember that
the two-dimensional pseudoscalar $I = e_1 e_2$ anticommutes with all vectors and
squares to $-1$. We therefore have

$$x' = -e_1 I I x e_1 = -e_1 I x e_1 I = -e_2 x e_2. \tag{2.105}$$

This is precisely the expected relation for a reflection in the line perpendicular
to $e_2$, which is to say a reflection in the real axis.


2.6.2 Reflecting bivectors 


Now suppose that we form the bivector $B = a \wedge b$ and reflect both of these
vectors in the plane perpendicular to $n$. The result is

$$B' = (-nan) \wedge (-nbn). \tag{2.106}$$

This simplifies as follows:

$$\begin{aligned} (-nan) \wedge (-nbn) &= \tfrac{1}{2}(nannbn - nbnnan) \\ &= \tfrac{1}{2} n(ab - ba)n \\ &= nBn. \end{aligned} \tag{2.107}$$


Sandwiching a multivector $M$ between two copies of a vector, $nMn$, always preserves
the grade of $M$. We will see how to prove this in general when
we have derived a few more results for manipulating inner and outer products. 
The resulting formula nBn shows that bivectors are subject to the same trans- 
formation law as vectors, except for a change in sign. This is the origin of the 
conventional distinction between polar and axial vectors. Axial vectors are usu- 
ally generated by the cross product, and we saw in section 2.4.3 that the cross 
product generates a bivector, and then dualises it back to a vector. But when the 
two vectors in the cross product are reflected, the bivector they form is reflected 
according to (2.107). The dual vector IB is subject to the same transformation 
law, since 


I(nBn) = n(IB)n, (2.108) 


and so does not transform as a (polar) vector. In many texts this can be a source 
of much confusion. But now we have a much healthier alternative: banish all 
talk of axial vectors in favour of bivectors. We will see in later chapters that 
all of the main examples of ‘axial’ vectors in physics (angular velocity, angular 
momentum, the magnetic field etc.) are better viewed as bivectors. 
2.6.3 Trivectors and handedness


The final object to try reflecting in three dimensions is the trivector $a \wedge b \wedge c$.
We first write

$$\begin{aligned} (-nan) \wedge (-nbn) \wedge (-ncn) &= \langle (-nan)(-nbn)(-ncn) \rangle_3 \\ &= -\langle nabcn \rangle_3, \end{aligned} \tag{2.109}$$

which follows because the only way to form a trivector from the geometric
product of three vectors is through the exterior product of all three. Now the product
$abc$ can only contain a vector and trivector term. The former cannot give rise to
an overall trivector, so we are left with

$$(-nan) \wedge (-nbn) \wedge (-ncn) = -\langle n \, a \wedge b \wedge c \, n \rangle_3. \tag{2.110}$$

But any trivector in three dimensions is a multiple of the pseudoscalar $I$, which
commutes with all vectors, so we are left with

$$(-nan) \wedge (-nbn) \wedge (-ncn) = -a \wedge b \wedge c. \tag{2.111}$$

The overall effect is simply to flip the sign of the trivector, which is a way of
stating that reflections have determinant $-1$. This means that if all three vectors
in a right-handed triplet are reflected in some plane, the resulting triplet is left
handed (and vice versa).


2.7 Rotations 


Our starting point for the treatment of rotations is the result that a rotation
in the plane generated by two unit vectors $m$ and $n$ is achieved by successive
reflections in the (hyper)planes perpendicular to $m$ and $n$. This is illustrated in
figure 2.8. Any component of $a$ perpendicular to the $m \wedge n$ plane is unaffected,
and simple trigonometry confirms that the angle between the initial vector $a$
and the final vector $c$ is twice the angle between $m$ and $n$. (The proof of this is
left as an exercise.) The result of the successive reflections is therefore to rotate
through $2\theta$ in the $m \wedge n$ plane, where $m \cdot n = \cos(\theta)$.

So how does this look using geometric algebra? We first form

$$b = -mam \tag{2.112}$$

and then perform a second reflection to obtain

$$c = -nbn = -n(-mam)n = nmamn. \tag{2.113}$$

This is starting to look extremely simple! We define

$$R = nm, \tag{2.114}$$

so that we can now write the result of the rotation as

$$c = R a R^\dagger. \tag{2.115}$$

This transformation $a \mapsto R a R^\dagger$ is a totally general way of handling rotations.
In deriving this transformation the dimensionality of the space of vectors was
never specified, so the transformation law must work in all spaces, whatever their
dimension. The rule also works for any grade of multivector!

Figure 2.8 A rotation from two reflections. The vector $b$ is the result of
reflecting $a$ in the plane perpendicular to $m$, and $c$ is the result of reflecting
$b$ in the plane perpendicular to $n$.


2.7.1 Rotors 


The quantity R = nm is called a rotor and is one of the most important objects 
in applications of geometric algebra. Immediately, one can see the importance 
of the geometric product in both (2.114) and (2.115), which tells us that rotors 
provide a way of handling rotations that is unique to geometric algebra. To 
study the properties of the rotor R we first write 


$$R = nm = n \cdot m + n \wedge m = \cos(\theta) + n \wedge m. \tag{2.116}$$

We already calculated the magnitude of the bivector $m \wedge n$ in equation (2.15),
where we obtained

$$(n \wedge m)(n \wedge m) = -\sin^2(\theta). \tag{2.117}$$

We therefore define the unit bivector $B$ in the $m \wedge n$ plane by

$$B = \frac{m \wedge n}{\sin(\theta)}, \qquad B^2 = -1. \tag{2.118}$$


The reason for this choice of orientation (m A n rather than n A m) is to ensure 
that the rotation has the orientation specified by the generating bivector, as can 
be seen in figure 2.8. In terms of the bivector B we now have 

R = cos(0) — Bsin(6), (2.119) 
which is simply the polar decomposition of a complex number, with the unit 
imaginary replaced by the unit bivector B. We can therefore write 


R = exp(—B6), (2.120) 


with the exponential defined in terms of its power series in the normal way. (The 
power series for the exponential is absolutely convergent for any multivector 
argument. ) 

Now recall that our formula was for a rotation through $2\theta$. If we want to
rotate through $\theta$, the appropriate rotor is
$$ R = \exp(-B\theta/2), \tag{2.121} $$
which gives the formula
$$ a \mapsto a' = e^{-B\theta/2}\, a\, e^{B\theta/2} \tag{2.122} $$
for a rotation through $\theta$ in the $B$ plane, with handedness determined by $B$ (see
figure 2.9). This description encourages us to think of rotations taking place 
in a plane, and as such gives equations which are valid in any dimension. The 
more traditional idea of rotations taking place around an axis is an entirely 
three-dimensional concept which does not generalise. 
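As a small worked case of the half-angle formula, take $B = e_1 e_2$ for orthonormal vectors $e_1$ and $e_2$ (a choice made here purely for illustration) and rotate $e_1$. Because $e_1$ anticommutes with $e_1 e_2$, moving it through the right-hand exponential reverses the sign of the exponent, and $e_1 e_2 e_1 = -e_2$:

```latex
\begin{aligned}
e^{-e_1 e_2 \theta/2}\, e_1\, e^{e_1 e_2 \theta/2}
  &= e^{-e_1 e_2 \theta}\, e_1 \\
  &= \bigl(\cos\theta - e_1 e_2 \sin\theta\bigr)\, e_1 \\
  &= \cos\theta\, e_1 + \sin\theta\, e_2 .
\end{aligned}
```

So $e_1$ is turned towards $e_2$ through the full angle $\theta$, which is the orientation defined by the bivector $e_1 e_2$.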

Since the rotor $R$ is a geometric product of two unit vectors, we see immediately that
$$ R R^\dagger = nm(nm)^\dagger = nmmn = 1 = R^\dagger R. \tag{2.123} $$
This provides a quick proof that our formula has the correct property of preserving
lengths and angles. Suppose that $a' = R a R^\dagger$ and $b' = R b R^\dagger$, then
$$
\begin{aligned}
a' \cdot b' &= \tfrac{1}{2}\left(R a R^\dagger R b R^\dagger + R b R^\dagger R a R^\dagger\right) \\
            &= \tfrac{1}{2} R (ab + ba) R^\dagger \\
            &= a \cdot b \, R R^\dagger \\
            &= a \cdot b.
\end{aligned}
\tag{2.124}
$$

Figure 2.9 A rotation in three dimensions. The vector $a$ is rotated to
$a' = R a R^\dagger$. The rotor $R$ is defined by $R = \exp(-B\theta/2)$, which describes
the rotation directly in terms of the plane and angle. The rotation has the
orientation specified by the bivector $B$.


We can also see that the inverse transformation is given by
$$ a = R^\dagger a' R. \tag{2.125} $$
The proof is straightforward:
$$ R^\dagger a' R = R^\dagger R a R^\dagger R = a. \tag{2.126} $$


The usefulness of rotors provides ample justification for adding up terms of 
different grades. The rotor R on its own has no geometric significance, which is 
to say that no meaning should be attached to the separate scalar and bivector 
terms. When $R$ is written in the form $R = \exp(-B\theta/2)$, however, the bivector
$B$ has clear geometric significance, as does the vector formed from $R a R^\dagger$. This
illustrates a central feature of geometric algebra, which is that both geometrically 
meaningful objects (vectors, planes etc.) and the elements that act on them (in 
this case rotors) are represented in the same algebra. 


2.7.2 Constructing a rotor 


Suppose that we wish to rotate the unit vector $a$ into another unit vector $b$,
leaving all vectors perpendicular to $a$ and $b$ unchanged. This is accomplished by
a reflection perpendicular to the unit vector $n$ half-way between $a$ and $b$ followed
by a reflection in the plane perpendicular to $b$ (see figure 2.10). The vector $n$ is
given by
$$ n = \frac{a + b}{|a + b|}, \tag{2.127} $$
which reflects $a$ into $-b$. Combining this with the reflection in the plane
perpendicular to $b$ we arrive at the rotor
$$ R = bn = \frac{1 + ba}{|a + b|} = \frac{1 + ba}{\sqrt{2(1 + b \cdot a)}}, \tag{2.128} $$
which represents a simple rotation in the $a \wedge b$ plane. This formula shows us that
$$ Ra = \frac{a + b}{\sqrt{2(1 + b \cdot a)}} = a\,\frac{1 + ab}{\sqrt{2(1 + b \cdot a)}} = a R^\dagger. \tag{2.129} $$

Figure 2.10 A rotation from $a$ to $b$. The vector $a$ is rotated onto $b$ by first
reflecting in the plane perpendicular to $n$, and then in the plane perpendicular
to $b$. The vectors $a$, $b$ and $n$ all have unit length.

It follows that we can write
$$ R a R^\dagger = R^2 a = a (R^\dagger)^2. \tag{2.130} $$
This is always possible for vectors in the plane of rotation. Returning to the
polar form $R = \exp(-B\theta/2)$, where $B$ is the $a \wedge b$ plane, we see that
$$ R^2 = \exp(-B\theta), \tag{2.131} $$
so we can rotate $a$ onto $b$ with the formula
$$ b = e^{-B\theta} a = a\, e^{B\theta}. \tag{2.132} $$


This is precisely the form found in the plane using complex numbers, and was
the source of much of the confusion over the use of quaternions for rotations.
Hamilton thought that a single-sided transformation law of the form $a \mapsto Ra$
should be the correct way to encode a rotation, with the full angle appearing
in the exponential. He thought that this was the natural generalisation of the
complex number representation. But we can see now that this formula only
works for vectors in the plane of rotation. The correct formula for all vectors is
the double-sided, half-angle formula $a \mapsto R a R^\dagger$. This formula ensures that given
a vector $c$ perpendicular to the $a \wedge b$ plane we have
$$ cR = c\,\frac{1 + ba}{\sqrt{2(1 + b \cdot a)}} = \frac{1 + ba}{\sqrt{2(1 + b \cdot a)}}\,c = Rc, \tag{2.133} $$
so that
$$ R c R^\dagger = c R R^\dagger = c, \tag{2.134} $$
and the vector is unrotated. The single-sided law does not have this property.
Correctly identifying the double-sided transformation law means that unit bivectors
such as
$$ e_1 e_2 = e^{e_1 e_2 \pi/2} \tag{2.135} $$
are generators of rotations through $\pi$, and not $\pi/2$. The fact that unit bivectors
square to $-1$ is consistent with this because, acting double sidedly, the rotor $-1$
is the identity operation. More generally, $R$ and $-R$ generate the same rotation,
so there is a two-to-one map between rotors and rotations. (Mathematicians talk
of the rotors providing a double-cover representation of the rotation group.)
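The construction of this section can be tried out numerically. The following is a minimal sketch, assuming three-dimensional Euclidean space with multivectors stored as dictionaries keyed by a basis-blade bitmask (e1 = 1, e2 = 2, e3 = 4, pseudoscalar = 7). The function names `gp`, `reverse`, `rotor_between` and `rotate` are hypothetical helpers introduced for this illustration.

```python
import math

def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            swaps, t = 0, a >> 1
            while t:  # count transpositions needed to sort the basis vectors
                swaps += bin(t & b).count("1")
                t >>= 1
            out[a ^ b] = out.get(a ^ b, 0.0) + (-x * y if swaps & 1 else x * y)
    return out

def reverse(A):
    """Reverse (dagger): flips the sign of the grade-2 and grade-3 parts."""
    return {k: (-v if bin(k).count("1") in (2, 3) else v) for k, v in A.items()}

def vec(x, y, z):
    return {1: x, 2: y, 4: z}

def coords(A):
    return tuple(A.get(k, 0.0) for k in (1, 2, 4))

def rotor_between(a, b):
    """R = (1 + ba) / sqrt(2(1 + b.a)) for unit vectors a, b with b != -a."""
    ba = gp(b, a)
    norm = math.sqrt(2.0 * (1.0 + ba.get(0, 0.0)))
    R = dict(ba)
    R[0] = R.get(0, 0.0) + 1.0
    return {k: v / norm for k, v in R.items()}

def rotate(R, A):
    """Double-sided law A -> R A R^dagger."""
    return gp(gp(R, A), reverse(R))

a = vec(1.0, 0.0, 0.0)
b = vec(0.0, 0.6, 0.8)
c = vec(0.0, -0.8, 0.6)          # perpendicular to both a and b

R = rotor_between(a, b)
minus_R = {k: -v for k, v in R.items()}

a_rotated = coords(rotate(R, a))              # should equal b
c_rotated = coords(rotate(R, c))              # should equal c
a_rotated_minus = coords(rotate(minus_R, a))  # -R gives the same rotation
single_sided_c = gp(gp(R, R), c)              # R^2 c is not even a vector
```

For the perpendicular vector `c` the single-sided product `R R c` is a pure trivector here, which illustrates why only the double-sided law leaves `c` alone.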


2.7.3 Rotating multivectors 


Suppose that the two vectors forming the bivector $B = a \wedge b$ are both rotated.
What is the expression for the resulting bivector? To find this we form
$$
\begin{aligned}
B' = a' \wedge b' &= \tfrac{1}{2}\left(R a R^\dagger R b R^\dagger - R b R^\dagger R a R^\dagger\right) \\
                  &= \tfrac{1}{2} R (ab - ba) R^\dagger \\
                  &= R\, a \wedge b\, R^\dagger \\
                  &= R B R^\dagger,
\end{aligned}
\tag{2.136}
$$
where we have used the rotor normalisation formula $R^\dagger R = 1$. Bivectors are
rotated using precisely the same formula as vectors! The same turns out to be
true for all geometric multivectors, and this is one of the most attractive features
of geometric algebra. In section 4.2 we prove that the transformation $A \mapsto R A R^\dagger$
preserves the grade of the multivector on which the rotors act. For applications
in three dimensions we only need check this result for the trivector case, as we
have already demonstrated it for vectors and bivectors. The pseudoscalar in
three dimensions, $I$, commutes with all other terms in the algebra, so we have
$$ R I R^\dagger = I R R^\dagger = I, \tag{2.137} $$



which is certainly grade-preserving. This result is one way of saying that ro- 
tations have determinant +1. We now have a means of rotating all geometric 
objects in three dimensions. In chapter 3 we will take full advantage of this when 
studying rigid-body dynamics. 


2.7.4 Rotor composition law 


Having seen how individual rotors are used to represent rotations, we now look
at their composition law. Let the rotor $R_1$ transform the vector $a$ into a vector $b$:
$$ b = R_1 a R_1^\dagger. \tag{2.138} $$
Now rotate $b$ into another vector $c$, using a rotor $R_2$. This requires
$$ c = R_2 b R_2^\dagger = R_2 R_1 a R_1^\dagger R_2^\dagger = R_2 R_1 a (R_2 R_1)^\dagger, \tag{2.139} $$
so that if we write
$$ c = R a R^\dagger, \tag{2.140} $$
then the composite rotor is given by
$$ R = R_2 R_1. \tag{2.141} $$
This is the group combination rule for rotors. Rotors form a group because the
product of two rotors is a third rotor, as can be checked from
$$ R_2 R_1 (R_2 R_1)^\dagger = R_2 R_1 R_1^\dagger R_2^\dagger = R_2 R_2^\dagger = 1. \tag{2.142} $$


In three dimensions the fact that the multivector $R$ contains only even-grade
elements and satisfies $R R^\dagger = 1$ is sufficient to ensure that $R$ is a rotor. The
fact that rotors form a continuous group (called a Lie group) is a subject we will 
return to later in this book. 

Rotors are the exception to the rule that all multivectors are subject to a 
double-sided transformation law. Rotors are already mixed-grade objects, so 
multiplying on the left (or right) by another rotor does not take us out of the 
space of rotors. All geometric entities, such as lines and planes, are single-grade 
objects, and their grades cannot be changed by a rotation. They are therefore 
all subject to a double-sided transformation law. Again, this brings us back to 
the central theme that both geometric objects and the operators acting on them 
are contained in a single algebra. 

The composition rule (2.141) has a surprising consequence. Suppose that the
rotor $R_1$ is kept fixed, and we set $R_2 = \exp(-B\theta/2)$. We now take the vector $c$
on a $2\pi$ excursion back to itself. The final rotor $R$ is
$$ R = e^{-B\pi} R_1 = -R_1. \tag{2.143} $$
The rotor has changed sign under a $2\pi$ rotation! This is usually viewed as
a quantum-mechanical phenomenon related to the existence of fermions. But
we can now see that the result is classical and is simply a consequence of our
rotor description of rotations. (The relationship between rotors and fermion
wavefunctions is discussed in chapter 8.) A geometric interpretation of the
distinction between $R$ and $-R$ is provided by the direction in which a rotation is
performed. Suppose we want to rotate $e_1$ onto $e_2$. The rotor to achieve this is
$$ R(\theta) = e^{-e_1 e_2 \theta/2}. \tag{2.144} $$
If we rotate in a positive sense through $\pi/2$ the final rotor is given by
$$ R(\pi/2) = \frac{1}{\sqrt{2}}(1 - e_1 e_2). \tag{2.145} $$
If we rotate in the negative (clockwise) sense, however, the final rotor is
$$ R(-3\pi/2) = -\frac{1}{\sqrt{2}}(1 - e_1 e_2) = -R(\pi/2). \tag{2.146} $$
So, while $R$ and $-R$ define the same absolute rotation (and the same rotation
matrix), their different signs can be employed to record information about the
handedness of the rotation.
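The composition rule $R = R_2 R_1$ and the sign change under a $2\pi$ rotation can both be seen in a few lines of code. This is a sketch only, assuming three-dimensional Euclidean space and a dictionary representation keyed by basis-blade bitmask (e1 = 1, e2 = 2, e3 = 4, so e1e2 = 3, e1e3 = 5, e2e3 = 6). The helpers `gp`, `reverse`, `rotor` and `rotate` are hypothetical names introduced here.

```python
import math

def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            swaps, t = 0, a >> 1
            while t:  # count transpositions needed to sort the basis vectors
                swaps += bin(t & b).count("1")
                t >>= 1
            out[a ^ b] = out.get(a ^ b, 0.0) + (-x * y if swaps & 1 else x * y)
    return out

def reverse(A):
    """Reverse (dagger): flips the sign of the grade-2 and grade-3 parts."""
    return {k: (-v if bin(k).count("1") in (2, 3) else v) for k, v in A.items()}

def rotor(B, theta):
    """R = exp(-B theta/2) = cos(theta/2) - B sin(theta/2) for a unit bivector B."""
    R = {k: -math.sin(theta / 2.0) * v for k, v in B.items()}
    R[0] = math.cos(theta / 2.0)
    return R

def rotate(R, A):
    return gp(gp(R, A), reverse(R))

e12, e23 = {3: 1.0}, {6: 1.0}
R1 = rotor(e12, math.pi / 2)     # quarter turn in the e1e2 plane
R2 = rotor(e23, math.pi / 2)     # quarter turn in the e2e3 plane
R = gp(R2, R1)                   # composite rotor, R1 acts first

a = {1: 0.3, 2: -1.2, 4: 0.7}
step_by_step = rotate(R2, rotate(R1, a))
in_one_go = rotate(R, a)

composite_angle = 2.0 * math.acos(R[0])   # angle of the combined rotation
full_turn = rotor(e12, 2.0 * math.pi)     # equals -1 up to rounding
```

Here the composite rotor works out as $\tfrac{1}{2}(1 - e_1 e_2 - e_2 e_3 + e_3 e_1)$, a rotation through $2\pi/3$, while `full_turn` has scalar part $-1$ even though it rotates every vector back to itself.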

The rotor composition rule provides a simple formula for the compound effect
of two rotations. Suppose that we have
$$ R_1 = e^{-B_1 \theta_1/2}, \qquad R_2 = e^{-B_2 \theta_2/2}, \tag{2.147} $$
where both $B_1$ and $B_2$ are unit bivectors. The product rotor is
$$
\begin{aligned}
R &= \bigl(\cos(\theta_2/2) - \sin(\theta_2/2) B_2\bigr)\bigl(\cos(\theta_1/2) - \sin(\theta_1/2) B_1\bigr) \\
  &= \cos(\theta_2/2)\cos(\theta_1/2) - \bigl(\cos(\theta_2/2)\sin(\theta_1/2) B_1 + \cos(\theta_1/2)\sin(\theta_2/2) B_2\bigr) \\
  &\quad + \sin(\theta_2/2)\sin(\theta_1/2) B_2 B_1.
\end{aligned}
\tag{2.148}
$$
So if we write $R = R_2 R_1 = \exp(-B\theta/2)$, where $B$ is a new unit bivector, we
immediately see that
$$ \cos(\theta/2) = \cos(\theta_2/2)\cos(\theta_1/2) + \sin(\theta_2/2)\sin(\theta_1/2) \langle B_2 B_1 \rangle \tag{2.149} $$
and
$$
\begin{aligned}
\sin(\theta/2) B &= \cos(\theta_2/2)\sin(\theta_1/2) B_1 + \cos(\theta_1/2)\sin(\theta_2/2) B_2 \\
                 &\quad - \sin(\theta_2/2)\sin(\theta_1/2) \langle B_2 B_1 \rangle_2.
\end{aligned}
\tag{2.150}
$$
These half-angle relations for rotations were first discovered by the mathematician
Rodrigues, three years before the invention of the quaternions! It is well
known that these provide a simple means of calculating the compound effect of
two rotations. Numerically, it is usually even simpler to just multiply the rotors
directly and not worry about calculating any trigonometric functions.


2.7.5 Euler angles


A standard way to parameterise rotations is via the three Euler angles $\{\phi, \theta, \psi\}$.
These are defined to rotate an initial set of axes, $\{e_1, e_2, e_3\}$, onto a new set
$\{e_1', e_2', e_3'\}$ (often denoted $x, y, z$ and $x', y', z'$ respectively). First we rotate
about the $e_3$ axis (i.e. in the $e_1 e_2$ plane) anticlockwise through an angle $\phi$.
The rotor for this is
$$ R_\phi = e^{-e_1 e_2 \phi/2}. \tag{2.151} $$
Next we rotate about the axis formed by the transformed $e_1$ axis through an
amount $\theta$. The plane for this is
$$ I R_\phi e_1 R_\phi^\dagger = R_\phi e_2 e_3 R_\phi^\dagger. \tag{2.152} $$
The rotor is therefore
$$ R_\theta = \exp\bigl(-R_\phi e_2 e_3 R_\phi^\dagger \theta/2\bigr) = R_\phi e^{-e_2 e_3 \theta/2} R_\phi^\dagger. \tag{2.153} $$
The intermediate rotor is now
$$ R' = R_\theta R_\phi = e^{-e_1 e_2 \phi/2} e^{-e_2 e_3 \theta/2}. \tag{2.154} $$
Note the order! Finally, we rotate about the transformed $e_3$ axis through an
angle $\psi$. The appropriate plane is now
$$ I R' e_3 R'^\dagger = R' e_1 e_2 R'^\dagger \tag{2.155} $$
and the rotor is
$$ R_\psi = \exp\bigl(-R' e_1 e_2 R'^\dagger \psi/2\bigr) = R' e^{-e_1 e_2 \psi/2} R'^\dagger. \tag{2.156} $$
The resultant rotor is therefore
$$ R = R_\psi R' = e^{-e_1 e_2 \phi/2} e^{-e_2 e_3 \theta/2} e^{-e_1 e_2 \psi/2}, \tag{2.157} $$


which has decoupled very nicely and is really quite simple — it is much easier to 
visualise and work with than the equivalent matrix formula! Now that we have 
geometric algebra at our disposal we will, in fact, have little cause to use the 
Euler angles in calculations. 
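The decoupling in the Euler-angle result rests on one cancellation that is used twice. Written out, with $R_\phi$, $R_\theta$, $R_\psi$ and $R'$ as defined in this section and using only the normalisation $R^\dagger R = 1$, the steps are:

```latex
\begin{aligned}
R' = R_\theta R_\phi
   &= \bigl(R_\phi\, e^{-e_2 e_3 \theta/2}\, R_\phi^\dagger\bigr) R_\phi
    = R_\phi\, e^{-e_2 e_3 \theta/2}, \\
R = R_\psi R'
   &= \bigl(R'\, e^{-e_1 e_2 \psi/2}\, R'^\dagger\bigr) R'
    = R'\, e^{-e_1 e_2 \psi/2}.
\end{aligned}
```

Each rotor about a transformed axis is a conjugated copy of a rotor about a fixed axis, so the trailing $R^\dagger R$ drops out and the fixed-axis exponentials appear in the reverse of the order in which the rotations are applied.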


2.8 Notes 


In this chapter we have given a lengthy introduction to geometric algebra in two
and three dimensions. The latter algebra is generated entirely by three basis
vectors $\{e_1, e_2, e_3\}$ subject to the rule that $e_i e_j + e_j e_i = 2\delta_{ij}$. This simple rule
generates an algebra of remarkable power and richness which we will explore in
following chapters.

There is a large literature on the geometric algebra of three-dimensional space
and its applications in physics. The most complete text is New Foundations
for Classical Mechanics by David Hestenes (1999). Hestenes has also written
many papers on the subject, most of which are listed in the bibliography at
the end of this book. Other introductory papers have been written by Gull,
Lasenby and Doran (1993a), Doran et al. (1996a) and Vold (1993a, 1993b).
Clifford’s Mathematical Papers (1882) are also of considerable interest. The
use of geometric algebra for handling rotations is very common in the fields
of engineering and computer science, though often purely in the guise of the
quaternion algebra. Searching one of the standard scientific databases with the
keyword ‘quaternions’ returns too many papers to begin to list here.


2.9 Exercises


2.1 From the properties of the geometric product, show that the symmetrised
product of two vectors satisfies the properties of a scalar product,
as listed in section 1.2.

2.2 By expanding the bivector $a \wedge b$ in terms of geometric products, prove
that it anticommutes with both $a$ and $b$, but commutes with any vector
perpendicular to the $a \wedge b$ plane.

2.3 Verify that the $E_1$ and $E_2$ matrices of equation (2.27) satisfy the correct
multiplication relations to form a representation of $\mathcal{G}_2$. Use these to
verify equations (2.26).

2.4 Construct the multiplication table generated by the orthonormal vectors
$e_1$, $e_2$ and $e_3$. Do these generate a (finite) group?

2.5 Prove that all of the following forms are equivalent expressions of the
vector cross product:
$$ a \times b = -I\, a \wedge b = b \cdot (Ia) = -a \cdot (Ib). $$
Interpret each form geometrically. Hence establish that
$$ a \times (b \times c) = -a \cdot (b \wedge c) = -(a \cdot b\, c - a \cdot c\, b) $$
and
$$ a \cdot (b \times c) = [a, b, c] = a \wedge b \wedge c\, I^{-1}. $$

2.6 Prove that the effect of successive reflections in the planes perpendicular
to the vectors $m$ and $n$ results in a rotation through twice the angle
between $m$ and $n$.

2.7 What is the reverse of $R a R^\dagger$, where $a$ is a vector? Which objects in
three dimensions have this property, and why must the result be another
vector?

2.8 Show that the rotor
$$ R = \frac{1 + ba}{|a + b|} $$
can also be written as $\exp(-B\theta/2)$, where $B$ is the unit bivector in the
$a \wedge b$ plane and $\theta$ is the angle between $a$ and $b$.

2.9 The Cayley-Klein parameters are a set of four real numbers $\alpha$, $\beta$, $\gamma$ and
$\delta$ subject to the normalisation condition
$$ \alpha^2 + \beta^2 + \gamma^2 + \delta^2 = 1. $$
These can be used to parameterise an arbitrary rotation matrix as follows:
$$
U = \begin{pmatrix}
\alpha^2 + \beta^2 - \gamma^2 - \delta^2 & 2(\beta\gamma + \alpha\delta) & 2(\beta\delta - \alpha\gamma) \\
2(\beta\gamma - \alpha\delta) & \alpha^2 - \beta^2 + \gamma^2 - \delta^2 & 2(\gamma\delta + \alpha\beta) \\
2(\beta\delta + \alpha\gamma) & 2(\gamma\delta - \alpha\beta) & \alpha^2 - \beta^2 - \gamma^2 + \delta^2
\end{pmatrix}.
$$
Can you relate the Cayley-Klein parameters to the rotor description?

2.10 Show that the set of all rotors forms a continuous group. Can you
identify the group manifold?


3 Classical mechanics


In this chapter we study the use of geometric algebra in classical mechanics. 
We will assume that readers already have a basic understanding of the subject, 
as a complete presentation of classical mechanics with geometric algebra would 
require an entire book. Such a book has been written, New Foundations for Clas- 
sical Mechanics by David Hestenes (1999), which looks in detail at many of the 
topics discussed here. Our main focus in this chapter is on areas where geometric
algebra offers some immediate benefits over traditional methods. These include 
motion in a central force and rigid-body rotations, both of which are dealt with 
in some detail. More advanced topics in Lagrangian and Hamiltonian dynamics 
are covered in chapter 12, and relativistic dynamics is covered in chapter 5. 

Classical mechanics was one of the areas of physics that prompted the devel- 
opment of many of the mathematical techniques routinely used today. This is 
particularly true of vector analysis, and it is now common to see classical me- 
chanics described using an abstract vector notation. Many of the formulae in this 
chapter should be completely familiar from such treatments. A key difference 
comes in adopting the outer product of vectors in place of the cross product. This 
means, for example, that angular momentum and torque both become bivectors. 
The outer product is clearer conceptually, but on its own it does not bring any 
calculational advantages. The main new computational tool we have at our dis- 
posal is the geometric product, and here we highlight a number of examples of 
its use. 

In this chapter we have chosen to write all vectors in a bold font. This is 
conventional for three-dimensional physics and many of the formulae presented 
below look unnatural if this notation is not followed. Bivectors and other general 
multivectors are left in regular font, which helps to distinguish them from vectors.


3.1 Elementary principles


We start by considering a point particle with a trajectory $\mathbf{x}(t)$ described as a
function of time. Here $\mathbf{x}$ is the position vector relative to some origin and the
time $t$ is taken as some absolute ‘Newtonian’ standard on which all observers
agree. The particle has velocity
$$ \mathbf{v} = \dot{\mathbf{x}} = \frac{d\mathbf{x}}{dt}, \tag{3.1} $$
where the overdot denotes differentiation with respect to time $t$. If the particle
has mass $m$, then the momentum $\mathbf{p}$ is defined by $\mathbf{p} = m\mathbf{v}$. Newton’s second law
of motion states that
$$ \dot{\mathbf{p}} = \mathbf{f}, \tag{3.2} $$
where the vector $\mathbf{f}$ is the force acting on the particle. Usually the mass $m$
is constant and we recover the familiar expression $\mathbf{f} = m\mathbf{a}$, where $\mathbf{a}$ is the
acceleration
$$ \mathbf{a} = \frac{d^2\mathbf{x}}{dt^2}. \tag{3.3} $$
The case of constant mass is assumed throughout this chapter. The path for
a single particle is then determined by a second-order differential equation
(assuming $\mathbf{f}$ does not depend on higher derivatives).

The work done by the force $\mathbf{f}$ on a particle is defined by the line integral
$$ W_{12} = \int_{t_1}^{t_2} \mathbf{f} \cdot \mathbf{v}\, dt = \int_1^2 \mathbf{f} \cdot d\mathbf{s}. \tag{3.4} $$
The final form here illustrates that the integral is independent of how the path
is parameterised. From Newton’s second law we have
$$ W_{12} = m \int_{t_1}^{t_2} \dot{\mathbf{v}} \cdot \mathbf{v}\, dt = \frac{m}{2} \int_{t_1}^{t_2} \frac{d}{dt}(v^2)\, dt, \tag{3.5} $$
where $v = |\mathbf{v}| = \sqrt{\mathbf{v}^2}$. It follows that the work done is equal to the change in
kinetic energy $T$, where
$$ T = \tfrac{1}{2} m v^2. \tag{3.6} $$
In the case where the work is independent of the path from point 1 to point 2 the
force is said to be conservative, and can be written as the gradient of a potential:
$$ \mathbf{f} = -\nabla V. \tag{3.7} $$
For conservative forces the work also evaluates to
$$ W_{12} = -\int_1^2 d\mathbf{s} \cdot \nabla V = V_1 - V_2 \tag{3.8} $$
and the total energy $E = T + V$ is conserved.


Figure 3.1 Angular momentum. The particle sweeps out the plane $L = \mathbf{x} \wedge \mathbf{p}$.
The angular momentum should be directly related to the area swept
out (cf. Kepler’s second law), so is naturally encoded as a bivector. The
position vector $\mathbf{x}$ depends on the choice of origin.


3.1.1 Angular momentum 


Angular momentum is traditionally discussed in terms of the cross product, even 
though it is quite clear that what is required is a way of encoding the area swept 
out by a particle as it moves relative to some origin (see figure 3.1). We saw in 
chapter 2 that the exterior product provides this, and that the more traditional 
cross product is a derived concept based on the three-dimensional result that 
every directed plane has a unique normal. We therefore have no hesitation 
in dispensing with the traditional definition of angular momentum as an axial 
vector, and replace it with a bivector. So, if a particle has momentum p and 
position vector x from some origin, we define the angular momentum of the 
particle about the origin as the bivector 


$$ L = \mathbf{x} \wedge \mathbf{p}. \tag{3.9} $$


This definition does not alter the steps involved in computing L since the com- 
ponents are the same as those of the cross product. We will see, however, that 
the freedom we have to now use the geometric product can speed up derivations. 
The definition of angular momentum as a bivector maintains a clear distinction 
with vector quantities such as position and velocity, removing the need for the 
rather awkward definitions of polar and axial vectors. The definition of L as a 
bivector also fits neatly with the rotor description of rotations, as we shall see 
later in this chapter. 
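The claim that the bivector components coincide with those of the cross product can be made concrete. The sketch below stores the bivector $\mathbf{x} \wedge \mathbf{p}$ as an antisymmetric array whose entry `[i][j]` is the coefficient of $e_{i+1} e_{j+1}$ (a representation chosen here for illustration). The names `wedge` and `cross` and the sample numbers are assumptions, not taken from the text.

```python
def wedge(a, b):
    """Bivector a ^ b as an antisymmetric array of components a_i b_j - a_j b_i."""
    n = len(a)
    return [[a[i] * b[j] - a[j] * b[i] for j in range(n)] for i in range(n)]

def cross(a, b):
    """Traditional cross product, defined only in three dimensions."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

m = 2.0
x = (1.0, -2.0, 0.5)
v = (0.3, 0.4, -1.0)
p = tuple(m * c for c in v)

L = wedge(x, p)                             # angular momentum bivector
components = (L[1][2], L[2][0], L[0][1])    # coefficients of e2e3, e3e1, e1e2
axial = cross(x, p)                         # components of the axial vector

velocity_term = wedge(v, p)                 # v ^ (m v) vanishes identically

# The same definition works unchanged in four dimensions, where no cross
# product exists and the bivector has six independent components.
L4 = wedge((1.0, 0.0, 2.0, -1.0), (0.5, 1.0, 0.0, 3.0))
```

The three numbers in `components` agree with `axial`, but only the bivector form carries over to other dimensions.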
If we differentiate $L$ we obtain
$$ \frac{dL}{dt} = \mathbf{v} \wedge (m\mathbf{v}) + \mathbf{x} \wedge (m\mathbf{a}) = \mathbf{x} \wedge \mathbf{f}. \tag{3.10} $$
We define the torque $N$ about the origin as the bivector
$$ N = \mathbf{x} \wedge \mathbf{f}, \tag{3.11} $$
so that the torque and angular momentum are related by
$$ \frac{dL}{dt} = N. \tag{3.12} $$


The idea of the torque being a bivector is also natural as torques act over a 
plane. The plane in question is defined by the vector f and the chosen origin, 
so both L and N depend on the origin. Recall also that bivectors are additive, 
much like vectors, so the result of applying two torques is found by adding the 
respective bivectors. 

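The angular momentum bivector can be written in an alternative way by first
defining $r = |\mathbf{x}|$ and writing
$$ \mathbf{x} = r \hat{\mathbf{x}}. \tag{3.13} $$
We therefore have
$$ \dot{\mathbf{x}} = \frac{d}{dt}(r \hat{\mathbf{x}}) = \dot{r} \hat{\mathbf{x}} + r \dot{\hat{\mathbf{x}}}, \tag{3.14} $$
so that
$$ L = m \mathbf{x} \wedge (\dot{r} \hat{\mathbf{x}} + r \dot{\hat{\mathbf{x}}}) = m r \hat{\mathbf{x}} \wedge (\dot{r} \hat{\mathbf{x}} + r \dot{\hat{\mathbf{x}}}) = m r^2 \hat{\mathbf{x}} \wedge \dot{\hat{\mathbf{x}}}. \tag{3.15} $$
But since $\hat{\mathbf{x}}^2 = 1$ we must have
$$ 0 = \frac{d}{dt} \hat{\mathbf{x}}^2 = 2 \hat{\mathbf{x}} \cdot \dot{\hat{\mathbf{x}}}. \tag{3.16} $$
We can therefore eliminate the outer product in equation (3.15) and write
$$ L = m r^2 \hat{\mathbf{x}} \dot{\hat{\mathbf{x}}} = -m r^2 \dot{\hat{\mathbf{x}}} \hat{\mathbf{x}}, \tag{3.17} $$
which is useful in a number of problems.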


3.1.2 Systems of particles 


The preceding definitions generalise easily to systems of particles. For these it
is convenient to distinguish between internal and external forces, so the force on
the $i$th particle is
$$ \sum_j \mathbf{f}_{ji} + \mathbf{f}_i^e = \dot{\mathbf{p}}_i. \tag{3.18} $$
Here $\mathbf{f}_i^e$ is the external force and $\mathbf{f}_{ij}$ is the force on the $j$th particle due to the
$i$th particle. We assume that $\mathbf{f}_{ii} = 0$. Newton’s third law (in its weak form)
states that
$$ \mathbf{f}_{ij} = -\mathbf{f}_{ji}. \tag{3.19} $$
This is not obeyed by all forces, but is assumed to hold for the forces considered
in this chapter. Summing the force equation over all particles we find that
$$ \sum_i m_i \mathbf{a}_i = \sum_i \mathbf{f}_i^e + \sum_{i,j} \mathbf{f}_{ij} = \sum_i \mathbf{f}_i^e. \tag{3.20} $$
All of the internal forces cancel as a consequence of the third law. We define the
centre of mass $\mathbf{X}$ by
$$ \mathbf{X} = \frac{1}{M} \sum_i m_i \mathbf{x}_i, \tag{3.21} $$
where $M$ is the total mass
$$ M = \sum_i m_i. \tag{3.22} $$
The position of the centre of mass is governed by the force law
$$ M \frac{d^2 \mathbf{X}}{dt^2} = \sum_i \mathbf{f}_i^e \tag{3.23} $$
and so only responds to the total external force on the system. The total
momentum of the system is defined by
$$ \mathbf{P} = \sum_i \mathbf{p}_i = M \frac{d\mathbf{X}}{dt} \tag{3.24} $$
and is conserved if the total external force is zero.

The total angular momentum about the chosen origin is found by summing
the individual bivector contributions,
$$ L = \sum_i \mathbf{x}_i \wedge \mathbf{p}_i. \tag{3.25} $$
The rate of change of $L$ is governed by
$$ \dot{L} = \sum_i \mathbf{x}_i \wedge \dot{\mathbf{p}}_i = \sum_i \mathbf{x}_i \wedge \mathbf{f}_i^e + \sum_{i,j} \mathbf{x}_i \wedge \mathbf{f}_{ji}. \tag{3.26} $$
The final term is a double sum containing pairs of terms going as
$$ \mathbf{x}_i \wedge \mathbf{f}_{ji} + \mathbf{x}_j \wedge \mathbf{f}_{ij} = (\mathbf{x}_i - \mathbf{x}_j) \wedge \mathbf{f}_{ji}. \tag{3.27} $$
The strong form of Newton’s third law states that the interparticle force $\mathbf{f}_{ij}$ is
directed along the vector $\mathbf{x}_i - \mathbf{x}_j$ between the two particles. This law is obeyed
by a sufficiently wide range of forces to make it a useful restriction. (The most
notable exception to this law is electromagnetism.) Under this restriction the
total angular momentum satisfies
$$ \frac{dL}{dt} = N^e, \tag{3.28} $$
where $N^e$ is the total external torque. If the applied external torque is zero, and
the strong law of action and reaction is obeyed, then the total angular momentum
is conserved.

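A useful expression for the angular momentum is obtained by introducing a
set of position vectors relative to the centre of mass. We write
$$ \mathbf{x}_i = \mathbf{x}_i' + \mathbf{X}, \tag{3.29} $$
so that
$$ \sum_i m_i \mathbf{x}_i' = 0. \tag{3.30} $$
The velocity of the $i$th particle is now
$$ \mathbf{v}_i = \mathbf{v}_i' + \mathbf{v}, \tag{3.31} $$
where $\mathbf{v} = \dot{\mathbf{X}}$ is the velocity of the centre of mass. The total angular momentum
contains four terms:
$$ L = \sum_i \left( \mathbf{X} \wedge m_i \mathbf{v} + \mathbf{x}_i' \wedge m_i \mathbf{v}_i' + m_i \mathbf{x}_i' \wedge \mathbf{v} + \mathbf{X} \wedge m_i \mathbf{v}_i' \right). \tag{3.32} $$
The final two terms both contain factors of $\sum_i m_i \mathbf{x}_i'$ (or its time derivative), and so vanish, leaving
$$ L = \mathbf{X} \wedge \mathbf{P} + \sum_i \mathbf{x}_i' \wedge \mathbf{p}_i'. \tag{3.33} $$
The total angular momentum is therefore the sum of the angular momentum of
the centre of mass about the origin, plus the angular momentum of the system
about the centre of mass. In many cases it is possible to choose the origin so
that the centre of mass is at rest, in which case $L$ is simply the total angular
momentum about the centre of mass. Similar considerations hold for the kinetic
energy, and it is straightforward to show that
$$ T = \sum_i \tfrac{1}{2} m_i v_i^2 = \tfrac{1}{2} M v^2 + \tfrac{1}{2} \sum_i m_i v_i'^2. \tag{3.34} $$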
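The centre-of-mass decompositions of $L$ and $T$ are easy to verify for an arbitrary collection of particles. The following sketch uses made-up masses, positions and velocities, and stores each bivector by its three components on $(e_2 e_3, e_3 e_1, e_1 e_2)$. The helper names `wedge`, `add`, `sub`, `scale` and `dot` are assumptions introduced for this illustration.

```python
def wedge(a, b):
    """Bivector a ^ b as components on (e2e3, e3e1, e1e2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(*vs):
    return tuple(sum(c) for c in zip(*vs))

def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def scale(s, v):
    return tuple(s * c for c in v)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

masses = [1.0, 2.5, 0.7]
xs = [(1.0, 0.0, 2.0), (-0.5, 1.5, 0.3), (0.2, -1.1, 0.9)]
vs = [(0.1, 0.4, -0.2), (0.0, -0.3, 0.6), (1.2, 0.5, 0.0)]

M = sum(masses)
X = scale(1.0 / M, add(*[scale(m, x) for m, x in zip(masses, xs)]))  # centre of mass
V = scale(1.0 / M, add(*[scale(m, v) for m, v in zip(masses, vs)]))  # its velocity
P = scale(M, V)                                                      # total momentum

# Total angular momentum: direct sum versus centre-of-mass split.
L_direct = add(*[wedge(x, scale(m, v)) for m, x, v in zip(masses, xs, vs)])
L_split = add(wedge(X, P),
              *[wedge(sub(x, X), scale(m, sub(v, V)))
                for m, x, v in zip(masses, xs, vs)])

# Kinetic energy: direct sum versus centre-of-mass split.
T_direct = sum(0.5 * m * dot(v, v) for m, v in zip(masses, vs))
T_split = 0.5 * M * dot(V, V) + sum(0.5 * m * dot(sub(v, V), sub(v, V))
                                    for m, v in zip(masses, vs))
```

Both pairs agree to rounding error, whatever values are chosen for the sample data.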


3.2 Two-body central force interactions 


One of the most significant applications of the preceding ideas is to a system
of two point masses moving under the influence of each other. The force acting
between the particles is directed along the vector between them, and all external
forces are assumed to vanish. It follows that both the total momentum $\mathbf{P}$ and
angular momentum $L$ are conserved.

We suppose that the particles have positions $\mathbf{x}_1$ and $\mathbf{x}_2$, and masses $m_1$ and
$m_2$. Newton’s second law for the central force problem takes the form
$$ m_1 \ddot{\mathbf{x}}_1 = \mathbf{f}, \tag{3.35} $$
$$ m_2 \ddot{\mathbf{x}}_2 = -\mathbf{f}, \tag{3.36} $$
where $\mathbf{f}$ is the interparticle force. We define the relative separation vector $\mathbf{x}$ by
$$ \mathbf{x} = \mathbf{x}_1 - \mathbf{x}_2. \tag{3.37} $$
This vector satisfies
$$ m_1 m_2 \ddot{\mathbf{x}} = (m_1 + m_2) \mathbf{f}. \tag{3.38} $$
We accordingly define the reduced mass $\mu$ by
$$ \frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2}, \tag{3.39} $$
so that the final force equation can be written as
$$ \mu \ddot{\mathbf{x}} = \mathbf{f}. \tag{3.40} $$
The two-body problem has now been reduced to an equivalent single-body
equation. The strong form of the third law assumed here means that the force $\mathbf{f}$ is
directed along $\mathbf{x}$, so we can write $\mathbf{f}$ as $f \hat{\mathbf{x}}$.
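The step from the two equations of motion to the relative equation can be spelled out. Multiplying (3.35) by $m_2$ and (3.36) by $m_1$ and subtracting gives, in the notation of this section:

```latex
\begin{aligned}
m_1 m_2 \ddot{\mathbf{x}}_1 - m_1 m_2 \ddot{\mathbf{x}}_2
  &= m_2 \mathbf{f} + m_1 \mathbf{f}, \\
m_1 m_2 \ddot{\mathbf{x}}
  &= (m_1 + m_2)\, \mathbf{f}, \\
\mu \ddot{\mathbf{x}} &= \mathbf{f},
  \qquad \mu = \frac{m_1 m_2}{m_1 + m_2}.
\end{aligned}
```

The last form of $\mu$ is equivalent to the definition $1/\mu = 1/m_1 + 1/m_2$.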

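We next re-express the total angular momentum in terms of the centre of mass
$\mathbf{X}$ and the relative vector $\mathbf{x}$. We start by writing
$$ m_1 \mathbf{x}_1 = m_1 \mathbf{X} + \mu \mathbf{x}, \qquad m_2 \mathbf{x}_2 = m_2 \mathbf{X} - \mu \mathbf{x}. \tag{3.41} $$
It follows that the total angular momentum $L_t$ is given by
$$
\begin{aligned}
L_t &= m_1 \mathbf{x}_1 \wedge \dot{\mathbf{x}}_1 + m_2 \mathbf{x}_2 \wedge \dot{\mathbf{x}}_2 \\
    &= M \mathbf{X} \wedge \dot{\mathbf{X}} + \mu \mathbf{x} \wedge \dot{\mathbf{x}}.
\end{aligned}
\tag{3.42}
$$
We have assumed that there are no external forces acting, so both $L_t$ and $\mathbf{P}$ are
conserved. It follows that the internal angular momentum is also conserved and
we write this as
$$ L = \mu \mathbf{x} \wedge \dot{\mathbf{x}}. \tag{3.43} $$
Since $L$ is constant, the motion of the particles is confined to the $L$ plane. The
trajectory of $\mathbf{x}$ must also sweep out area at a constant rate, since this is how
$L$ is defined. For planetary motion this is Kepler’s second law, though he did
not state it in quite this form. Kepler treated the sun as the origin, whereas $L$
should be defined relative to the centre of mass.

The internal kinetic energy is
$$ T = \tfrac{1}{2} \mu \dot{\mathbf{x}}^2 = \tfrac{1}{2} \mu (\dot{r} \hat{\mathbf{x}} + r \dot{\hat{\mathbf{x}}})^2 = \tfrac{1}{2} \mu \dot{r}^2 + \tfrac{1}{2} \mu r^2 \dot{\hat{\mathbf{x}}}^2. \tag{3.44} $$
From equation (3.17) we see that
$$ L^2 = -\mu^2 r^4 \dot{\hat{\mathbf{x}}} \hat{\mathbf{x}} \hat{\mathbf{x}} \dot{\hat{\mathbf{x}}} = -\mu^2 r^4 \dot{\hat{\mathbf{x}}}^2. \tag{3.45} $$
We therefore define the constant $l$ as the magnitude of $L$, so
$$ l = \mu r^2 |\dot{\hat{\mathbf{x}}}|. \tag{3.46} $$
The kinetic energy can now be written as a function of $r$ and $\dot{r}$ only:
$$ T = \frac{\mu \dot{r}^2}{2} + \frac{l^2}{2 \mu r^2}. \tag{3.47} $$
The force $\mathbf{f}$ is conservative and can be written in terms of a potential $V(r)$ as
$$ \mathbf{f} = f \hat{\mathbf{x}} = -\nabla V(r), \tag{3.48} $$
where
$$ f = -\frac{dV}{dr}. \tag{3.49} $$
Since the force is conservative the total energy is conserved, so
$$ E = \frac{\mu \dot{r}^2}{2} + \frac{l^2}{2 \mu r^2} + V(r) \tag{3.50} $$
is a constant. For a given potential $V(r)$ this equation can be integrated to find
the evolution of $r$. The full motion can then be recovered from $L$.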


3.2.1 Inverse-square forces 


The most important example of a two-body central force interaction is that 
described by an inverse-square force law. This case is encountered in gravitation 
and electrostatics and has been analysed in considerable detail by many authors 
(see the end of this chapter for suggested additional reading). In this section 
we review some of the key features of this system, highlighting the places where 
geometric algebra offers something new. An alternative approach to this problem 
is discussed in section 3.3. 
Writing $f = -k/r^2$ the basic equation to solve is
$$ \mu \ddot{\mathbf{x}} = -\frac{k}{r^2} \hat{\mathbf{x}} = -\frac{k}{r^3} \mathbf{x}. \tag{3.51} $$
The sign of $k$ determines whether the force is attractive or repulsive (positive
for attractive). This is a second-order vector differential equation, so we expect
there to be two constant vectors in the solution: one for the initial position
and one for the velocity. We already know that the angular momentum $L$ is a
constant of motion, and we can write this as
$$ L = \mu r^2 \hat{\mathbf{x}} \dot{\hat{\mathbf{x}}} = -\mu r^2 \dot{\hat{\mathbf{x}}} \hat{\mathbf{x}}. \tag{3.52} $$
It follows that
$$ L \dot{\mathbf{v}} = -\frac{k}{\mu r^2} L \hat{\mathbf{x}} = k \dot{\hat{\mathbf{x}}}, \tag{3.53} $$
which we can write in the form
$$ \frac{d}{dt}(L \mathbf{v} - k \hat{\mathbf{x}}) = 0. \tag{3.54} $$
The motion is therefore described by the simple equation
$$ L \mathbf{v} = k(\hat{\mathbf{x}} + \mathbf{e}), \tag{3.55} $$
where the eccentricity vector $\mathbf{e}$ is a second vector constant of motion. This vector
is also known in various contexts as the Laplace vector and as the Runge-Lenz
vector. From its definition we can see that $\mathbf{e}$ must lie in the $L$ plane.

Eccentricity    Energy                       Orbit
$e > 1$         $E > 0$                      Hyperbola
$e = 1$         $E = 0$                      Parabola
$e < 1$         $E < 0$                      Ellipse
$e = 0$         $E = -\mu k^2/(2l^2)$        Circle

Table 3.1 Classification of orbits for an inverse-square force law.

To find a direct equation for the trajectory we first write
$$ L \mathbf{v} \mathbf{x} = L(\mathbf{v} \cdot \mathbf{x} + \mathbf{v} \wedge \mathbf{x}) = -\frac{1}{\mu} L^2 + \mathbf{v} \cdot \mathbf{x}\, L = k(r + \mathbf{e} \mathbf{x}). \tag{3.56} $$
The scalar part of this equation, with $L^2 = -l^2$, gives
$$ r = \frac{l^2}{k \mu} - \mathbf{e} \cdot \mathbf{x}. \tag{3.57} $$
This equation specifies a conic surface in three dimensions with symmetry axis
$\mathbf{e}$. The surface is formed by rotating a two-dimensional conic about this axis.
Since the motion takes place entirely within the $L$ plane the motion is described
by a conic. That is, the trajectory $\mathbf{x}(t)$ is one of a hyperbola, parabola, ellipse
or circle. The generic cases are ellipses for bound orbits and hyperbolae for free
states. The cases of parabolic and circular orbits are exceptional as they require
precise values of $|\mathbf{e}|$ (table 3.1).
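The defining relation $L\mathbf{v} = k(\hat{\mathbf{x}} + \mathbf{e})$ can be evaluated directly for motion in a plane. The sketch below assumes the orbit lies in the $e_1 e_2$ plane, so that $L = l\, e_1 e_2$ and $e_1 e_2 (v_1 e_1 + v_2 e_2) = v_2 e_1 - v_1 e_2$. The function name `orbit_constants` and the sample state are hypothetical.

```python
import math

def orbit_constants(x, v, mu, k):
    """Return (l, e, E) for a planar state (x, v) of mu*xddot = -k x / r^3.

    l is the signed coefficient in L = l e1e2, e the eccentricity vector and
    E the energy with the potential set to zero at infinity.
    """
    r = math.hypot(x[0], x[1])
    l = mu * (x[0] * v[1] - x[1] * v[0])     # L = mu x ^ v = l e1e2
    Lv = (l * v[1], -l * v[0])               # geometric product L v, a vector
    e = (Lv[0] / k - x[0] / r, Lv[1] / k - x[1] / r)
    E = 0.5 * mu * (v[0] ** 2 + v[1] ** 2) - k / r
    return l, e, E

mu, k = 2.0, 3.0
x = (1.0, 0.5)
v = (-0.3, 1.1)
l, e, E = orbit_constants(x, v, mu, k)

r = math.hypot(*x)
r_from_conic = l ** 2 / (k * mu) - (e[0] * x[0] + e[1] * x[1])   # conic equation
ecc = math.hypot(*e)                                             # eccentricity |e|
E_from_ecc = mu * k ** 2 * (ecc ** 2 - 1.0) / (2.0 * l ** 2)     # energy from |e|
```

For this state `r_from_conic` reproduces `r`, and the energy computed from the eccentricity matches the kinetic-plus-potential value. Since $|\mathbf{e}| < 1$ and $E < 0$, the orbit is an ellipse.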

In $L$ and $\mathbf{e}$ we have found five of the six constants of motion (we only have
two arbitrary constants in $\mathbf{e}$ as it is constrained to lie in the $L$ plane). The
final constant specifies where on the conic we start at time $t = 0$. We know
that the energy is also a constant of motion, so it should be possible to express
the energy directly in terms of $L$ and $\mathbf{e}$. From equation (3.51) we see that the
potential energy must go as $-k/r$, provided we set the arbitrary constant so that
$V = 0$ at infinity. The full energy is therefore given by
$$ E = \frac{\mu}{2} \mathbf{v}^2 - \frac{k}{r}. \tag{3.58} $$
To simplify this we first form
$$ (L \mathbf{v})^2 = l^2 \mathbf{v}^2 = k^2 (\hat{\mathbf{x}} + \mathbf{e})^2. \tag{3.59} $$
It follows that
$$ E = \frac{\mu k^2}{2 l^2} (1 + 2 \hat{\mathbf{x}} \cdot \mathbf{e} + e^2) - \frac{k}{r} = \frac{\mu k^2}{2 l^2} (e^2 - 1), \tag{3.60} $$
where $e = |\mathbf{e}|$ is the eccentricity. The sign of the energy is governed entirely by $e$.
Since the potential is set to zero at infinity, all bound states must have negative
energy and hence an eccentricity $e < 1$. The limiting case of $e = 1$ describes a
parabola (table 3.1).


3.2.2 Motion in time for elliptic orbits 


Many methods can be used to find the trajectory as a function of time and these 
are discussed widely in the literature. Here we describe one of the simplest, which 
serves to highlight the essential difficulty of this problem. An alternative solution, 
which more fully exploits the techniques of geometric algebra, is described in 
section 3.3. From the energy equation we see that 
2 
per? = 24E — = + es (3.61) 
r r 


so t is given by 


i rdr 
V= I l (3.62) 
ro (2ukr + 2uEr? — 1?)1/2 


Evaluating this integral results in a rather complicated function of r, the general 
form of which is hard to invert and not very helpful. More useful formulae 
are obtained by specialising to one form of orbit. For bound problems we are 
interested in elliptic orbits for which E is negative. For these orbits it is useful 
to introduce the semi-major axis a defined by 


a=5(ritre)=— (3.63) 


2E’ 
where rı and r2 are the maximum and minimum values of r respectively. In 
terms of this we can write 


k k 
2ukr + QuEr? — P= a (r? — 2ar) —-P? = a (ae? — (r —a)”). (3.64) 
a a 
We now introduce a new variable W, the eccentric anomaly, defined by 
r = a(1 — ecos(®)). (3.65) 


In terms of this we find 


t= (+) we ie (1 — ecos(W)) dv, (3.66) 


Wo 


so if we choose t = 0 to correspond to closest approach we have 


wt = Y —esin(WV), (3.67) 


63 


CLASSICAL MECHANICS 


where 
k 
io (3.68) 
pa 

Equations (3.65) and (3.67) provide a parametric solution relating r and t. This
solution highlights the fact that the equation relating t and r is transcendental
and does not have a simple closed form. The time taken for one orbit is $2\pi/\omega$,
so the orbital period $\tau$ is related to the semi-major axis a by

$$\tau^2 = \frac{4\pi^2\mu}{k}\,a^3. \tag{3.69}$$

This gives us the third of Kepler’s three laws of planetary motion, that the square
of the period is proportional to the cube of the major axis.
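
Because the relation between t and the eccentric anomaly is transcendental, in practice it is inverted numerically. The sketch below shows one common way of doing this, Newton iteration on equation (3.67), followed by equation (3.65) to recover r(t). The function names, the starting guess and the tolerance are illustrative choices of ours, not part of the text, and the simple starting guess may need refining for eccentricities close to 1.

```python
import math

def eccentric_anomaly(mean_anomaly, e, tol=1e-12, max_iter=50):
    """Solve omega*t = psi - e*sin(psi), equation (3.67), by Newton iteration."""
    psi = mean_anomaly  # starting guess; adequate for moderate eccentricity
    for _ in range(max_iter):
        residual = psi - e * math.sin(psi) - mean_anomaly
        step = residual / (1.0 - e * math.cos(psi))
        psi -= step
        if abs(step) < tol:
            break
    return psi

def orbital_period(a, k, mu):
    """Period from equation (3.69): tau^2 = 4 pi^2 mu a^3 / k."""
    return 2.0 * math.pi * math.sqrt(mu * a**3 / k)

def radius_at_time(t, a, e, k, mu):
    """Radial distance r(t), with t = 0 at closest approach."""
    omega = math.sqrt(k / (mu * a**3))      # equation (3.68)
    psi = eccentric_anomaly(omega * t, e)   # equation (3.67)
    return a * (1.0 - e * math.cos(psi))    # equation (3.65)

if __name__ == "__main__":
    a, e, k, mu = 2.0, 0.3, 1.0, 1.0        # illustrative values only
    tau = orbital_period(a, k, mu)
    for fraction in (0.0, 0.25, 0.5):
        print(fraction, radius_at_time(fraction * tau, a, e, k, mu))
```

At t = 0 the routine returns a(1 − e), and half a period later it returns a(1 + e), as expected for the closest and furthest points of the ellipse.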


3.3 Celestial mechanics and perturbations 


By far the most important application of the Newtonian theory of gravitation is 
to the motion of the planets in the solar system. This is a complicated subject 
of considerable historical and current importance, and we will only touch on a 
few applications. Detailed calculation of the motions of all of the planets in the 
solar system still represents a major computational challenge. Aside from the 
obvious problem of having to calculate the gravitational effects of every planet on 
every other planet, further effects must also be incorporated. These can include 
deviations of the shapes of the planets from spherical, the effects of tidal forces 
and ultimately general relativistic corrections. 

A significant number of problems in celestial mechanics are best treated us- 
ing perturbation theory. In this technique orbits are calculated as a series of 
ever smaller deviations from Kepler orbits. Since the Kepler orbit is specified 
entirely by L and e, we should first form equations for these in the presence of 
a perturbing force. We modify the force law to read

$$\mu\ddot{x} = -\frac{k}{r^3}\,x + f, \tag{3.70}$$

and assume that f is always small compared with the inverse-square term. The
angular momentum L now satisfies

$$\dot{L} = x\wedge f, \tag{3.71}$$

so L is now only conserved if f is also a central force. With the eccentricity
vector still defined by equation (3.55), we find that

$$k\dot{\boldsymbol{e}} = \dot{L}v + \frac{1}{\mu}Lf. \tag{3.72}$$

Only five of the six equations for L and e are independent, as we always have
$L\wedge\boldsymbol{e} = 0$.

For many problems the variation in L and e is slow compared to the orbital
period. For these a useful approximation is obtained by finding the orbital 
average of f over one cycle, with L and e held constant. The quantities L and e 
are then assumed to vary slowly under the influence of the time-averaged force. 
Results for the orbital averages of numerous quantities can be found tabulated 
in many textbooks and are discussed in the exercises at the end of this chapter. 


3.3.1 Example — general relativistic perturbations 


Later in this book we will study how general relativity modifies the Newtonian 
view of gravity. For particles moving in a central potential, the modification 
is quite simple and can be handled efficiently using perturbation theory. The
modified force law is

$$\mu\ddot{x} = -\frac{GM\mu}{r^2}\left(1 + \frac{3l^2}{\mu^2c^2r^2}\right)\hat{x}, \tag{3.73}$$

where c is the speed of light and we have replaced k by the gravitational
expression GMμ. (A small subtlety is that the derivatives here are with respect to
proper time, but this does not affect our reasoning.) The force is still central, so
the angular momentum L is still conserved. The eccentricity vector satisfies the
simple equation

$$\dot{\boldsymbol{e}} = \frac{3l^2}{\mu^3c^2r^4}\,\hat{x}\cdot L. \tag{3.74}$$

For bound orbits this gives rise to a precession of the major axis (see figure 3.2).
The quantity of most interest is the amount e changes in one orbit. To get an
approximate result for this we use the time-averaging idea and assume that the
orbit is precisely elliptical. We therefore have

$$\Delta\boldsymbol{e} = \frac{3l^2}{\mu^3c^2}\int_0^T dt\,\frac{\hat{x}\cdot L}{r^4}, \tag{3.75}$$

where T is the orbital period. Evaluating this integral is left as an exercise, and
the final result is

$$\Delta\boldsymbol{e} = \frac{6\pi GM}{a(1-e^2)c^2}\,\boldsymbol{e}\cdot\hat{L}, \tag{3.76}$$


where $\hat{L} = L/l$. This gives a precession of e with the orientation of L, which
corresponds to an advance (figure 3.2). For Mercury this gives rise to the
famous advance in the perihelion of 43 arcseconds per century, which was finally
explained by general relativity.

Figure 3.2 Orbital precession. The plot shows a modified orbit as
predicted by general relativity. The ellipse precesses round in the same
direction as the orbital motion. The parameters have been chosen to exaggerate
the precession effect.
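
The size of the effect for Mercury can be checked with a few lines of arithmetic. The sketch below evaluates the advance per orbit, 6πGM/(a(1 − e²)c²), and converts it to arcseconds per century. The numerical values for the Sun's gravitational parameter and for Mercury's semi-major axis, eccentricity and period are standard approximate figures that we supply here as assumptions; they do not appear in the text.

```python
import math

# Approximate standard values (assumptions, not taken from the text).
GM_SUN = 1.32712440018e20   # gravitational parameter of the Sun, m^3 s^-2
C = 299_792_458.0           # speed of light, m/s
A_MERCURY = 5.7909e10       # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056          # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969        # orbital period of Mercury, days

def advance_per_orbit(gm, a, e, c=C):
    """Perihelion advance in radians per orbit: 6 pi GM / (a (1 - e^2) c^2)."""
    return 6.0 * math.pi * gm / (a * (1.0 - e**2) * c**2)

def arcsec_per_century(gm, a, e, period_days):
    """Accumulated advance over a Julian century (36525 days), in arcseconds."""
    orbits = 36525.0 / period_days
    return math.degrees(advance_per_orbit(gm, a, e) * orbits) * 3600.0

if __name__ == "__main__":
    print(arcsec_per_century(GM_SUN, A_MERCURY, E_MERCURY, PERIOD_DAYS))
```

With these inputs the result is close to the 43 arcseconds per century quoted above.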


3.3.2 Spinor equations 


An alternative method for analysing the Kepler problem is through the use of
‘spinors’. These will be defined more carefully in later chapters, but in two and
three dimensions they can be viewed as elements of the subalgebra of $\mathcal{G}_2$ and $\mathcal{G}_3$
consisting entirely of even elements. In two dimensions a spinor can therefore be
identified with a complex number. The position vector x in two dimensions can
be formed through a rotation and dilation via the polar decomposition

$$x = e_1r\exp(\theta e_1e_2) = r\exp(-\theta e_1e_2)e_1, \tag{3.77}$$

where $\{e_1, e_2\}$ denote a right-handed orthonormal frame and we assume that the
vector lies in the $e_1e_2$ plane. We know from chapter 2 that the rotation formula
only extends to higher dimensions if a double-sided prescription is adopted, so
we write the vector x as

$$x = Ue_1U^\dagger = U^2e_1 = e_1U^{\dagger 2}. \tag{3.78}$$

In writing this we have placed all of the dynamics in the complex number U.
For the Kepler problem it turns out that the equation for U is considerably
easier than that for x. We assume that the plane of L is given by $e_1e_2$, and start
by forming

$$r = |x| = UU^\dagger. \tag{3.79}$$

(Recall that, for a scalar + bivector combination in two dimensions, the reverse
operator is the same as complex conjugation.) On differentiating we find that

$$\dot{x} = 2\dot{U}Ue_1, \tag{3.80}$$

hence

$$2r\dot{U} = \dot{x}e_1U^\dagger = \dot{x}Ue_1. \tag{3.81}$$
We now introduce the new variable s defined by

$$\frac{d}{ds} = r\frac{d}{dt}, \qquad \frac{dt}{ds} = r. \tag{3.82}$$

In terms of this

$$2\frac{dU}{ds} = \dot{x}Ue_1, \tag{3.83}$$

and

$$2\frac{d^2U}{ds^2} = r\ddot{x}Ue_1 + \dot{x}\frac{dU}{ds}e_1 = U\left(\ddot{x}x + \tfrac{1}{2}\dot{x}^2\right). \tag{3.84}$$

Now suppose we have motion in a central inverse-square force:

$$\mu\ddot{x} = -k\frac{x}{r^3}. \tag{3.85}$$

The equation for U becomes

$$\frac{d^2U}{ds^2} = \frac{1}{2\mu}U\left(\tfrac{1}{2}\mu\dot{x}^2 - \frac{k}{r}\right) = \frac{E}{2\mu}U, \tag{3.86}$$

which is simply the equation for harmonic motion! This has a number of
advantages. First of all, the equation is easy to solve. If we set

$$\omega^2 = -\frac{E}{2\mu}, \tag{3.87}$$

then the general solution is

$$U = A\exp(I\omega s) + B\exp(-I\omega s), \tag{3.88}$$

where A and B are constants and I is the unit bivector for the plane of motion.
The motion is illustrated in figure 3.3. The particle trajectory maps out an 
ellipse with the origin at one focus, whereas U defines an ellipse with the origin 
at the centre. The particle completes two orbits for each full cycle of U. 

Figure 3.3 Solution to the Kepler problem. The particle orbit is shown on
the left, and the corresponding spinor on the right. The particle completes
two orbits every time U completes one cycle, since U and −U describe the
same position.

Further advantages of formulating the dynamics in terms of U are that the
equation for U is linear, so is better suited to perturbation theory, and that there
is no singularity at r = 0, which provides better numerical stability. (Removing
this singularity is called ‘regularization’.) In addition, equation (3.86) is universal:
it holds for E > 0 and E < 0. The solution when E > 0 simply has
trigonometric functions replaced by exponentials. This universality is important,
because perturbations can often send bound orbits into unbound ones.
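
The two-dimensional spinor solution is easy to explore numerically, since the even subalgebra can be identified with the complex numbers. The sketch below represents the unit bivector by Python's `1j`, builds U(s) from two exponentials, and forms the position from the square of U. The choice of constants A and B (picked so that U takes the cosine and sine form with semi-major axis a and eccentricity e) and all function names are our own illustrative assumptions. Note that multiplying out U² e₁ with e₁e₂e₁ = −e₂ means the e₂ component is minus the imaginary part of U².

```python
import cmath
import math

def spinor(s, a, e, omega):
    """U(s) = A exp(i w s) + B exp(-i w s), with the unit bivector written as 1j."""
    p, q = math.sqrt(a * (1.0 + e)), math.sqrt(a * (1.0 - e))
    A, B = 0.5 * (p - q), 0.5 * (p + q)   # gives U = p cos(ws) - 1j q sin(ws)
    return A * cmath.exp(1j * omega * s) + B * cmath.exp(-1j * omega * s)

def position(U):
    """Components of x = U^2 e1 along (e1, e2); e1 e2 e1 = -e2 flips the sign of Im."""
    z = U * U
    return (z.real, -z.imag)

def on_ellipse(x, a, e, tol=1e-9):
    """True if x lies on the ellipse with semi-major axis a and a focus at the origin."""
    b = a * math.sqrt(1.0 - e * e)
    return abs(((x[0] - a * e) / a) ** 2 + (x[1] / b) ** 2 - 1.0) < tol

if __name__ == "__main__":
    a, e, omega = 2.0, 0.4, 0.7           # illustrative values only
    for n in range(5):
        U = spinor(0.5 * n, a, e, omega)
        print(position(U), abs(U) ** 2, on_ellipse(position(U), a, e))
```

Advancing s by π/ω changes the sign of U but returns the same position, which is the statement that the particle completes two orbits for each full cycle of U.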

For the method to be truly powerful, however, it must extend to three dimen- 
sions. The relevant formula in three dimensions is 


$$x = Ue_1U^\dagger, \tag{3.89}$$

where U is a general even element. This means that U has four degrees of
freedom now, whereas only three are required to specify x. We are therefore free
to impose a further additional constraint on U, which we will use to ensure the
equations take on a convenient form. The quantity $UU^\dagger$ is still a scalar in three
dimensions, so we have

$$r = UU^\dagger = U^\dagger U. \tag{3.90}$$

We next form $\dot{x}$:

$$\dot{x} = \dot{U}e_1U^\dagger + Ue_1\dot{U}^\dagger. \tag{3.91}$$

We would like this to equal $2\dot{U}e_1U^\dagger$ for the preceding analysis to follow through.
For this to hold we require

$$\dot{U}e_1U^\dagger - Ue_1\dot{U}^\dagger = \dot{U}e_1U^\dagger - (\dot{U}e_1U^\dagger)^\dagger = 0. \tag{3.92}$$

The quantity $\dot{U}e_1U^\dagger$ only contains odd grade terms (grade-1 and grade-3). If we
subtract its reverse, all that remains is the trivector (pseudoscalar) term. We
therefore require that

$$\langle\dot{U}e_1U^\dagger\rangle_3 = 0, \tag{3.93}$$

which we adopt as our extra condition on U. With this condition satisfied we
have

$$2\frac{dU}{ds} = \dot{x}Ue_1, \tag{3.94}$$

and

$$2\frac{d^2U}{ds^2} = \left(\ddot{x}x + \tfrac{1}{2}\dot{x}^2\right)U. \tag{3.95}$$

For an inverse-square force law we therefore recover the same harmonic oscillator
equation. In the presence of a perturbing force we have

$$2\mu\frac{d^2U}{ds^2} - EU = fxU = rfUe_1. \tag{3.96}$$

This equation for U can be handled using standard techniques from perturbation
theory. The equation was first found (in matrix form) by Kustaanheimo and
Stiefel in 1964. The analysis was refined and cast in its present form by Hestenes
(1999).


3.4 Rotating systems and rigid-body motion 


Rigid bodies can be viewed as another example of a system of particles, where 
now the effect of the internal forces is to keep all of the interparticle distances 
fixed. For such systems the internal forces can be ignored once one has found a 
set of dynamical variables that enforce the rigid-body constraint. The problem 
then reduces to solving for the motion of the centre of mass and for the angular 
momentum in the presence of any external forces or torques. Suitable variables 
are a vector x(t) for the centre of mass, and a set of variables to describe the 
attitude of the rigid body in space. Many forms exist for the latter variables, 
but here we will concentrate on parameterising the attitude of the rigid body 
with a rotor. Before applying this idea to rigid-body motion, we first look at the 
description of rotating frames with rotors. 


3.4.1 Rotating frames 


Suppose that the frame of vectors $\{f_k\}$ is rotating in space. These can be related
to a fixed orthonormal frame $\{e_k\}$ by the time-dependent rotor R(t):

$$f_k(t) = R(t)\,e_k\,R^\dagger(t). \tag{3.97}$$

The angular velocity vector ω is traditionally defined by the formula

$$\dot{f}_k = \omega\times f_k, \tag{3.98}$$

where the cross denotes the vector cross product. From section 2.4.3 we know
that the cross product is related to the inner product with a bivector by

$$\omega\times f_k = (-I\omega)\cdot f_k = f_k\cdot(I\omega). \tag{3.99}$$

We are now used to the idea that angular momentum is best viewed as a bivector,
and we must expect the same to be true for angular velocity. We therefore define
the angular velocity bivector Ω by

$$\Omega = I\omega. \tag{3.100}$$

This choice ensures that the rotation has the orientation implied by Ω.
To see how Ω is related to the rotor R we start by differentiating
equation (3.97):

$$\dot{f}_k = \dot{R}e_kR^\dagger + Re_k\dot{R}^\dagger = \dot{R}R^\dagger f_k + f_kR\dot{R}^\dagger. \tag{3.101}$$

From the normalisation equation $RR^\dagger = 1$ we find that

$$0 = \frac{d}{dt}(RR^\dagger) = \dot{R}R^\dagger + R\dot{R}^\dagger. \tag{3.102}$$

Since differentiation and reversion are interchangeable operations we now have

$$\dot{R}R^\dagger = -R\dot{R}^\dagger = -(\dot{R}R^\dagger)^\dagger. \tag{3.103}$$

The quantity $\dot{R}R^\dagger$ is equal to minus its own reverse and has even grade, so must
be a pure bivector. The equation for $f_k$ now becomes

$$\dot{f}_k = \dot{R}R^\dagger f_k - f_k\dot{R}R^\dagger = (2\dot{R}R^\dagger)\cdot f_k. \tag{3.104}$$

Comparing this with equation (3.99) and equation (3.100) we see that $2\dot{R}R^\dagger$
must equal minus the angular velocity bivector Ω, so

$$2\dot{R}R^\dagger = -\Omega. \tag{3.105}$$

The dynamics is therefore contained in the single rotor equation

$$\dot{R} = -\tfrac{1}{2}\Omega R. \tag{3.106}$$

The reversed form of this is also useful:

$$\dot{R}^\dagger = \tfrac{1}{2}R^\dagger\Omega. \tag{3.107}$$


Equations of this type are surprisingly ubiquitous in physics. In the more general 
setting, rotors are viewed as elements of a Lie group, and the bivectors form their 
Lie algebra. We will have more to say about this in chapter 11. 


3.4.2 Constant Ω


For the case of constant Ω equation (3.106) integrates immediately to give

$$R = e^{-\Omega t/2}R_0, \tag{3.108}$$

which is the rotor for a constant frequency rotation in the positive sense in the
Ω plane. The frame rotates according to

$$f_k(t) = e^{-\Omega t/2}R_0\,e_k\,R_0^\dagger e^{\Omega t/2}. \tag{3.109}$$

The constant term $R_0$ describes the orientation of the frame at t = 0, relative to
the $\{e_k\}$ frame.

Figure 3.4 Orientation of the angular velocity bivector. Ω has the
orientation of $f_1\wedge\dot{f}_1$. It must therefore have orientation $+e_1\wedge e_2$ when $\omega = e_3$.

As an example, consider the case of motion about the $e_3$ axis (figure 3.4). We
have

$$\Omega = \omega Ie_3 = \omega e_1e_2, \tag{3.110}$$

and for convenience we set $R_0 = 1$. The motion is described by

$$f_k(t) = \exp\left(-\tfrac{1}{2}e_1e_2\omega t\right)e_k\exp\left(\tfrac{1}{2}e_1e_2\omega t\right), \tag{3.111}$$

so that the $f_1$ axis rotates as

$$f_1 = e_1\exp(e_1e_2\omega t) = \cos(\omega t)\,e_1 + \sin(\omega t)\,e_2. \tag{3.112}$$

This defines a right-handed (anticlockwise) rotation in the $e_1e_2$ plane, as
prescribed by the orientation of Ω.
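
The rotation of the frame under a constant angular velocity bivector can be checked numerically. The sketch below is one possible way to do this: it assumes the familiar matrix representation in which the Pauli matrices stand in for the frame vectors e₁, e₂, e₃ and reversion becomes the Hermitian conjugate. That representation, the small 2 × 2 helper routines and all function names are our own choices for illustration; the geometric content is just the double-sided rotor formula applied with the rotor for rotation in the e₁e₂ plane.

```python
import math

# Pauli matrices as 2x2 nested tuples, standing in for e1, e2, e3 (an assumed representation).
E1 = ((0, 1), (1, 0))
E2 = ((0, -1j), (1j, 0))
E3 = ((1, 0), (0, -1))

def mul(a, b):
    """Product of two 2x2 matrices, representing the geometric product."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def reverse(a):
    """Reversion corresponds to the Hermitian conjugate in this representation."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def rotor(omega, t):
    """R = exp(-e1 e2 omega t / 2) = cos(omega t / 2) - e1 e2 sin(omega t / 2)."""
    c, s = math.cos(omega * t / 2), math.sin(omega * t / 2)
    e12 = mul(E1, E2)
    return tuple(tuple(c * (i == j) - s * e12[i][j] for j in range(2)) for i in range(2))

def rotate(R, v):
    """Double-sided rotor action R v R^dagger."""
    return mul(mul(R, v), reverse(R))

def components(v):
    """Coefficients of e1, e2, e3 in a matrix representing a vector."""
    return (v[0][1].real, v[1][0].imag, v[0][0].real)

if __name__ == "__main__":
    omega, t = 2.0, 0.3                       # illustrative values only
    print(components(rotate(rotor(omega, t), E1)))
    print((math.cos(omega * t), math.sin(omega * t), 0.0))
```

The two printed triples agree: the f₁ axis has components (cos ωt, sin ωt, 0), an anticlockwise rotation in the e₁e₂ plane, while e₃ is left unchanged.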


3.4.3 Rigid-body motion


Suppose that a rigid body is moving through space. To describe the position
in space of any part of the body, we need to specify the position of the centre
of mass, and the vector to the point in the body from the centre of mass. The
latter can be encoded in terms of a rotation from a fixed ‘reference’ body onto
the body in space (figure 3.5). We let $x_0$ denote the position of the centre of
mass and $y_i(t)$ denote the position (in space) of a point in the body. These are
related by

$$y_i(t) = R(t)\,x_i\,R^\dagger(t) + x_0(t), \tag{3.113}$$

where $x_i$ is a fixed constant vector in the reference copy of the body. In this
manner we have placed all of the rotational motion in the time-dependent rotor
R(t).

Figure 3.5 Description of a rigid body. The vector $x_0(t)$ specifies the
position of the centre of mass, relative to the origin. The rotor R(t) defines
the orientation of the body, relative to a fixed copy imagined to be placed
at the origin. x is a vector in the reference body, and y is the vector in
space of the equivalent point on the moving body.

The velocity of the point $y = RxR^\dagger + x_0$ is

$$\begin{aligned}
v(t) &= \dot{R}xR^\dagger + Rx\dot{R}^\dagger + \dot{x}_0\\
&= -\tfrac{1}{2}\Omega RxR^\dagger + \tfrac{1}{2}RxR^\dagger\Omega + v_0\\
&= (RxR^\dagger)\cdot\Omega + v_0,
\end{aligned} \tag{3.114}$$

where $v_0$ is the velocity of the centre of mass. The bivector Ω defines the plane of
rotation in space. This plane will lie at some orientation relative to the current
position of the rigid body. For studying the motion it turns out to be extremely
useful to transform the rotation plane back into the fixed, reference copy of the
body. Since bivectors are subject to the same rotor transformation law as vectors
we define the ‘body’ angular velocity $\Omega_B$ by

$$\Omega_B = R^\dagger\Omega R. \tag{3.115}$$

In terms of the body angular velocity the rotor equation becomes

$$\dot{R} = -\tfrac{1}{2}\Omega R = -\tfrac{1}{2}R\Omega_B, \qquad \dot{R}^\dagger = \tfrac{1}{2}\Omega_BR^\dagger. \tag{3.116}$$

The velocity of the body is now re-expressed as

$$v(t) = R\,x\cdot\Omega_B\,R^\dagger + v_0, \tag{3.117}$$

which will turn out to be the more convenient form. (We have used the operator
ordering conventions of section 2.5 to suppress unnecessary brackets in writing
$R\,x\cdot\Omega_B\,R^\dagger$ in place of $R(x\cdot\Omega_B)R^\dagger$.)

To calculate the momentum of the rigid body we need the masses of each
of the constituent particles. It is easier at this point to go to a continuum
approximation and introduce a density ρ = ρ(x). The position vector x is taken
relative to the centre of mass, so we have

$$\int d^3x\,\rho = M \quad\text{and}\quad \int d^3x\,\rho\,x = 0. \tag{3.118}$$

The momentum of the rigid body is simply

$$\int d^3x\,\rho\,v = \int d^3x\,\rho\,(R\,x\cdot\Omega_B\,R^\dagger + v_0) = Mv_0, \tag{3.119}$$

so is specified entirely by the motion of the centre of mass. This is the continuum
version of the result of section 3.1.2.


3.4.4 The inertia tensor 


The next quantity we require is the angular momentum bivector L for the body
about its centre of mass. We therefore form

$$\begin{aligned}
L &= \int d^3x\,\rho\,(y - x_0)\wedge v\\
&= \int d^3x\,\rho\,(RxR^\dagger)\wedge(R\,x\cdot\Omega_B\,R^\dagger + v_0)\\
&= R\left(\int d^3x\,\rho\,x\wedge(x\cdot\Omega_B)\right)R^\dagger.
\end{aligned} \tag{3.120}$$

The integral inside the brackets refers only to the fixed copy and so defines a
time-independent function of $\Omega_B$. This is the reason for working with $\Omega_B$ instead
of the space angular velocity Ω. We define the inertia tensor $\mathcal{I}(B)$ by

$$\mathcal{I}(B) = \int d^3x\,\rho\,x\wedge(x\cdot B). \tag{3.121}$$

This is a linear function mapping bivectors to bivectors. This way of writing
linear functions may be unfamiliar to those used to seeing tensors labelled with
indices, but the notation is the natural extension to linear functions of the
index-free approach advocated in this book. The linearity of the map is easy to check:

$$\begin{aligned}
\mathcal{I}(\lambda A + \mu B) &= \int d^3x\,\rho\,x\wedge\bigl(x\cdot(\lambda A + \mu B)\bigr)\\
&= \int d^3x\,\rho\,\bigl(\lambda\,x\wedge(x\cdot A) + \mu\,x\wedge(x\cdot B)\bigr)\\
&= \lambda\,\mathcal{I}(A) + \mu\,\mathcal{I}(B).
\end{aligned} \tag{3.122}$$


The fact that the inertia tensor maps bivectors to bivectors, rather than vectors
to vectors, is also a break from tradition. This viewpoint is very natural given our
earlier comments about the merits of bivectors over axial vectors, and provides a
clear geometric picture of the tensor (figure 3.6). Since both vectors and bivectors
belong to a three-dimensional linear space, there is no additional complexity
introduced in this new picture.

Figure 3.6 The inertia tensor. The inertia tensor $\mathcal{I}(B)$ is a linear function
mapping its bivector argument B onto a bivector. It returns the total
angular momentum about the centre of mass for rotation in the B plane.

To understand the effect of the inertia tensor, suppose that the body rotates
in the B plane at a fixed rate |B|, and we place the origin at the centre of mass
(which is fixed). The velocity of the vector x is simply $x\cdot B$, and the momentum
density at this point is $\rho\,x\cdot B$, as shown in figure 3.6. The angular momentum
density bivector is therefore $x\wedge(\rho\,x\cdot B)$, and integrating this over the entire body
returns the total angular momentum bivector for rotation in the B plane.

In general, the total angular momentum will not lie in the same plane as the 
angular velocity. This is one reason why rigid-body dynamics can often seem 
quite counterintuitive. When we see a body rotating, our eyes naturally pick out 
the angular velocity by focusing on the vector the body rotates around. Deciding 
the plane of the angular momentum is less easy, particularly if the internal mass 
distribution is hidden from us. But it is the angular momentum that responds 
directly to external torques, not the angular velocity, and this can have some 
unexpected consequences. 

We have calculated the inertia tensor about the centre of mass, but bodies 
rotating around a fixed axis can be forced to rotate about any point. A useful 
theorem relates the inertia tensor about an arbitrary point to one about the 
centre of mass. Suppose that we want the inertia tensor relative to the point a, 
where a is a vector taken from the centre of mass. Returning to the definition 
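of equation (3.121) we see that we need to compute

$$\mathcal{I}_a(B) = \int d^3x\,\rho\,(x - a)\wedge\bigl((x - a)\cdot B\bigr). \tag{3.123}$$

This integral evaluates to give

$$\begin{aligned}
\mathcal{I}_a(B) &= \int d^3x\,\rho\,\bigl(x\wedge(x\cdot B) - x\wedge(a\cdot B) - a\wedge(x\cdot B) + a\wedge(a\cdot B)\bigr)\\
&= \mathcal{I}(B) + M\,a\wedge(a\cdot B),
\end{aligned} \tag{3.124}$$

where the two cross terms vanish because $\int d^3x\,\rho\,x = 0$ when x is measured from
the centre of mass. The inertia tensor relative to a is simply the inertia tensor
about the centre of mass, plus the tensor for a point mass M at position a.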
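
This shift theorem can be tested on a small collection of point masses. The sketch below works with dual vectors rather than bivectors: writing B = Ib and using a × b = −I a∧b, one finds x·B = b × x and x∧(x·B) = I (x × (b × x)), so the dual of the inertia tensor acting on B is the sum of m x × (b × x). This translation to cross products, the discrete sum replacing the integral, and all function names are assumptions made for the purpose of illustration.

```python
def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def scale(s, a):
    return tuple(s * p for p in a)

def centre_of_mass_frame(masses, positions):
    """Shift positions so that sum(m * x) = 0, the discrete form of (3.118)."""
    total_mass = sum(masses)
    com = (0.0, 0.0, 0.0)
    for m, x in zip(masses, positions):
        com = add(com, scale(m / total_mass, x))
    return [sub(x, com) for x in positions]

def inertia_dual(masses, positions, b, about=(0.0, 0.0, 0.0)):
    """Dual vector of the inertia tensor acting on B = I b, taken about the point `about`."""
    total = (0.0, 0.0, 0.0)
    for m, x in zip(masses, positions):
        y = sub(x, about)
        total = add(total, scale(m, cross(y, cross(b, y))))
    return total

if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0]                                   # illustrative values only
    xs = centre_of_mass_frame(masses, [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0)])
    a, b = (0.5, -1.0, 2.0), (0.3, 0.7, -0.2)
    shifted = inertia_dual(masses, xs, b, about=a)
    predicted = add(inertia_dual(masses, xs, b), scale(sum(masses), cross(a, cross(b, a))))
    print(shifted, predicted)
```

The two printed vectors agree to rounding error, reflecting the fact that the cross terms cancel once positions are measured from the centre of mass.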


3.4.5 Principal axes


So far we have only given an abstract specification of the inertia tensor. For
most calculations it is necessary to introduce a set of basis vectors fixed in the
body. As we are free to choose the directions of these vectors, we should ensure
that this choice simplifies the equations of motion as much as possible. To see
how to do this, consider the $\{e_i\}$ frame and define the matrix $\mathcal{I}_{ij}$ by

$$\mathcal{I}_{ij} = -(Ie_i)\cdot\mathcal{I}(Ie_j). \tag{3.125}$$

This defines a symmetric matrix, as follows from the result

$$A\cdot\bigl(x\wedge(x\cdot B)\bigr) = \langle Ax(x\cdot B)\rangle = \langle(A\cdot x)xB\rangle = B\cdot\bigl(x\wedge(x\cdot A)\bigr). \tag{3.126}$$

(This sort of manipulation, where one uses the projection onto grade to replace
inner and outer products by geometric products, is very common in geometric
algebra.) This result ensures that

$$\begin{aligned}
\mathcal{I}_{ij} &= -\int d^3x\,\rho\,(Ie_i)\cdot\bigl(x\wedge(x\cdot(Ie_j))\bigr)\\
&= -\int d^3x\,\rho\,(Ie_j)\cdot\bigl(x\wedge(x\cdot(Ie_i))\bigr) = \mathcal{I}_{ji}.
\end{aligned} \tag{3.127}$$

It follows that the matrix $\mathcal{I}_{ij}$ will be diagonal if the $\{e_i\}$ frame is chosen to
coincide with the eigendirections of the inertia tensor. These directions are
called the principal axes, and we always choose our frame along these directions.
The matrix $\mathcal{I}_{ij}$ is also positive-(semi)definite, as can be seen from

$$\begin{aligned}
a_ia_j\mathcal{I}_{ij} &= -\int d^3x\,\rho\,(Ia)\cdot\bigl(x\wedge(x\cdot(Ia))\bigr)\\
&= \int d^3x\,\rho\,\bigl(x\cdot(Ia)\bigr)^2 \geq 0.
\end{aligned} \tag{3.128}$$

It follows that all of the eigenvalues of $\mathcal{I}_{ij}$ must be positive (or possibly zero for
the case of point or line masses). These eigenvalues are the principal moments
of inertia and are crucial in specifying the properties of a rigid body. We denote
these $\{i_1, i_2, i_3\}$, so that

$$\mathcal{I}_{jk} = \delta_{jk}\,i_k \quad\text{(no sum)}. \tag{3.129}$$

(It is more traditional to use a capital I for the moments of inertia, but this
symbol is already employed for the pseudoscalar.) If two or three of the principal
moments are the same the principal axes are not uniquely specified. In this case
one simply chooses one orthonormal set of eigenvectors from the degenerate
family of possibilities.

Returning to the index-free presentation, we see that the principal axes satisfy

$$\mathcal{I}(Ie_j) = Ie_k\,\mathcal{I}_{jk} = i_j\,Ie_j, \tag{3.130}$$

where again there is no sum implied between eigenvectors and their associated
eigenvalue in the final expression. To calculate the effect of the inertia tensor on
an arbitrary bivector B we decompose B in terms of the principal axes as

$$B = B_j\,Ie_j. \tag{3.131}$$

It follows that

$$\mathcal{I}(B) = \sum_{j=1}^{3} i_jB_j\,Ie_j = i_1B_1e_2e_3 + i_2B_2e_3e_1 + i_3B_3e_1e_2. \tag{3.132}$$

The fact that for most bodies the principal moments are not equal demonstrates
that $\mathcal{I}(B)$ will not lie in the same plane as B, unless B is perpendicular to one
of the principal axes.
A useful result for calculating the inertia tensor is that the principal axes of 
a body always coincide with symmetry axes, if any are present. This simplifies 
the calculation of the inertia tensor for a range of standard bodies, the results 
for which can be found in some of the books listed at the end of this chapter. 


3.4.6 Kinetic energy and angular momentum 


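To calculate the kinetic energy of the body from the velocity of equation (3.114)
we form the integral

$$\begin{aligned}
T &= \tfrac{1}{2}\int d^3x\,\rho\,(R\,x\cdot\Omega_B\,R^\dagger + v_0)^2\\
&= \tfrac{1}{2}\int d^3x\,\rho\,\bigl((x\cdot\Omega_B)^2 + 2v_0\cdot(R\,x\cdot\Omega_B\,R^\dagger) + v_0^2\bigr)\\
&= \tfrac{1}{2}\int d^3x\,\rho\,(x\cdot\Omega_B)^2 + \tfrac{1}{2}Mv_0^2.
\end{aligned} \tag{3.133}$$

Again, there is a clean split into a rotational contribution and a term due to
the motion of the centre of mass. Concentrating on the former, we use the
manipulation

$$(x\cdot\Omega_B)^2 = \langle x\cdot\Omega_B\,x\,\Omega_B\rangle = -\Omega_B\cdot\bigl(x\wedge(x\cdot\Omega_B)\bigr) \tag{3.134}$$

to write the rotational contribution as

$$-\tfrac{1}{2}\Omega_B\cdot\left(\int d^3x\,\rho\,x\wedge(x\cdot\Omega_B)\right) = -\tfrac{1}{2}\Omega_B\cdot\mathcal{I}(\Omega_B). \tag{3.135}$$

The minus sign is to be expected because bivectors all have negative squares.
The sign can be removed by reversing one of the bivectors to construct a
positive-definite product. The total kinetic energy is therefore

$$T = \tfrac{1}{2}Mv_0^2 + \tfrac{1}{2}\Omega_B^\dagger\cdot\mathcal{I}(\Omega_B). \tag{3.136}$$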
The inertia tensor is constructed from the point of view of the fixed body.
From equation (3.120) we see that the angular momentum in space is obtained
by rotating the body angular momentum $\mathcal{I}(\Omega_B)$ onto the space configuration,
that is,

$$L = R\,\mathcal{I}(\Omega_B)\,R^\dagger. \tag{3.137}$$

We can understand this expression as follows. Suppose that a body rotates in
space with angular velocity Ω. At a given instant we carry out a fixed rotation
to align everything back with the fixed reference configuration. This reference
copy then has angular velocity $\Omega_B = R^\dagger\Omega R$. The inertia tensor (fixed in the
reference copy) returns the angular momentum, given an input angular velocity.
The result of this is then rotated forwards onto the body in space, to return L.
The space and body angular velocities are related by $\Omega = R\Omega_BR^\dagger$, so the
kinetic energy can be written in the form

$$T = \tfrac{1}{2}Mv_0^2 + \tfrac{1}{2}\Omega^\dagger\cdot L. \tag{3.138}$$

We now introduce components $\{\omega_k\}$ for both Ω and $\Omega_B$ by writing

$$\Omega = \sum_{k=1}^{3}\omega_k\,If_k, \qquad \Omega_B = \sum_{k=1}^{3}\omega_k\,Ie_k. \tag{3.139}$$

In terms of these we recover the standard expression

$$T = \tfrac{1}{2}Mv_0^2 + \sum_{k=1}^{3}\tfrac{1}{2}i_k\omega_k^2. \tag{3.140}$$


3.4.7 Equations of motion 


The equations of motion are $\dot{L} = N$, where N is the external torque. The inertia
tensor is time-independent since it only refers to the static ‘reference’ copy of
the rigid body, so we find that

$$\begin{aligned}
\dot{L} &= \dot{R}\,\mathcal{I}(\Omega_B)R^\dagger + R\,\mathcal{I}(\Omega_B)\dot{R}^\dagger + R\,\mathcal{I}(\dot{\Omega}_B)R^\dagger\\
&= R\left(\mathcal{I}(\dot{\Omega}_B) - \tfrac{1}{2}\Omega_B\mathcal{I}(\Omega_B) + \tfrac{1}{2}\mathcal{I}(\Omega_B)\Omega_B\right)R^\dagger.
\end{aligned} \tag{3.141}$$

At this point it is extremely useful to have a symbol to denote one-half of the
commutator of two bivectors. The standard symbol for this is the cross, ×, so
we define the commutator product by

$$A\times B = \tfrac{1}{2}(AB - BA). \tag{3.142}$$

This notation does raise the possibility of confusion with the vector cross
product, but as the latter is not needed any more this should not pose a problem.
The commutator product is so ubiquitous in applications that it needs its own
symbol, and the cross is particularly convenient as it correctly conveys the
antisymmetry of the product. In section 4.1.3 we prove that the commutator of
any two bivectors results in a third bivector. This is easily confirmed in three
dimensions by expressing both bivectors in terms of their dual vectors.

With the commutator product at our disposal the equations of motion are now
written concisely as

$$\dot{L} = R\bigl(\mathcal{I}(\dot{\Omega}_B) - \Omega_B\times\mathcal{I}(\Omega_B)\bigr)R^\dagger. \tag{3.143}$$


The typical form of the rigid-body equations is recovered by expanding in terms
of components. In terms of these we have

$$\begin{aligned}
\dot{L} &= R\left(\sum_{k=1}^{3}i_k\dot{\omega}_k\,Ie_k - \sum_{j,k=1}^{3}i_k\omega_j\omega_k\,(Ie_j)\times(Ie_k)\right)R^\dagger\\
&= \sum_{k=1}^{3}i_k\dot{\omega}_k\,If_k + \sum_{j,k,l=1}^{3}\epsilon_{jkl}\,i_k\omega_j\omega_k\,If_l.
\end{aligned} \tag{3.144}$$

If we let $N_k$ denote the components of the torque N in the rotating $f_k$ frame,

$$N = \sum_{k=1}^{3}N_k\,If_k, \tag{3.145}$$

we recover the Euler equations of motion for a rigid body:

$$\begin{aligned}
i_1\dot{\omega}_1 - \omega_2\omega_3(i_2 - i_3) &= N_1,\\
i_2\dot{\omega}_2 - \omega_3\omega_1(i_3 - i_1) &= N_2,\\
i_3\dot{\omega}_3 - \omega_1\omega_2(i_1 - i_2) &= N_3.
\end{aligned} \tag{3.146}$$

Various methods can be used to solve these equations and are described in most
mechanics textbooks. Here we will simply illustrate some features of the
equations, and describe a solution method which does not resort to the explicit
coordinate equations.


3.4.8 Torque-free motion


The torque-free equation $\dot{L} = 0$ reduces to

$$\mathcal{I}(\dot{\Omega}_B) - \Omega_B\times\mathcal{I}(\Omega_B) = 0. \tag{3.147}$$

This is a first-order constant coefficient differential equation for the bivector $\Omega_B$.
Closed form solutions exist, but before discussing some of these it is useful to
consider the conserved quantities. Throughout this section we ignore any overall
motion of the centre of mass of the rigid body. Since $\dot{L} = 0$ both the kinetic
energy and the magnitude of L are constant. To exploit this we introduce the
components

$$L_k = i_k\omega_k, \qquad L = \sum_{k=1}^{3}L_k\,If_k. \tag{3.148}$$

These are the components of L in the rotating $f_k$ frame. So, even though L is
constant, the components $L_k$ are time-dependent. In terms of these components
the magnitude of L is given by

$$LL^\dagger = L_1^2 + L_2^2 + L_3^2, \tag{3.149}$$

and the kinetic energy is

$$T = \frac{L_1^2}{2i_1} + \frac{L_2^2}{2i_2} + \frac{L_3^2}{2i_3}. \tag{3.150}$$

Both |L| and T are constants of motion, which imposes two constraints on the
three components $L_k$. A useful way to visualise this is to think in terms of a
vector l with components $L_k$:

$$l = \sum_{k=1}^{3}L_k\,e_k = -IR^\dagger LR. \tag{3.151}$$

This is the vector perpendicular to $R^\dagger LR$, a rotating vector in the fixed
reference body. Conservation of |L| means that l is constrained to lie on a sphere,
and conservation of T restricts l to the surface of an ellipsoid. Possible paths
for l for a given rigid body are therefore defined by the intersections of a sphere
with a family of ellipsoids (governed by T). For the case of unequal principal
moments these orbits are non-degenerate. Examples of these orbits are shown in
figure 3.7. This figure shows that orbits around the axes with the smallest and
largest principal moments are stable, whereas around the middle axis the orbits
are unstable. Any small change in the energy of the body will tend to throw it
into a very different orbit if the orbit of l approaches close to $e_2$.

Figure 3.7 Angular momentum orbits. The point described by the vector
l simultaneously lies on the surface of a sphere and an ellipsoid. The figure
shows possible paths on the sphere for l in the case of $i_1 < i_2 < i_3$, with
the $e_3$ axis vertical.
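
The two conservation laws are a convenient check on any numerical treatment of torque-free motion. The sketch below integrates the component (Euler) equations with all torques set to zero, using a classical fourth-order Runge-Kutta step, and monitors the squared magnitude of L and the kinetic energy. The integrator, step size and initial conditions are our own illustrative choices; the text itself advocates working with the rotor equation rather than with components.

```python
def euler_rhs(w, i):
    """Torque-free Euler equations, solved for the time derivatives of (w1, w2, w3)."""
    w1, w2, w3 = w
    i1, i2, i3 = i
    return (w2 * w3 * (i2 - i3) / i1,
            w3 * w1 * (i3 - i1) / i2,
            w1 * w2 * (i1 - i2) / i3)

def rk4_step(w, i, dt):
    """One classical Runge-Kutta step for the body-frame components."""
    def shifted(base, k, h):
        return tuple(b + h * kk for b, kk in zip(base, k))
    k1 = euler_rhs(w, i)
    k2 = euler_rhs(shifted(w, k1, dt / 2), i)
    k3 = euler_rhs(shifted(w, k2, dt / 2), i)
    k4 = euler_rhs(shifted(w, k3, dt), i)
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def invariants(w, i):
    """Return (L1^2 + L2^2 + L3^2, T) with L_k = i_k w_k."""
    L = [ik * wk for ik, wk in zip(i, w)]
    return (sum(Lk ** 2 for Lk in L),
            sum(Lk ** 2 / (2 * ik) for Lk, ik in zip(L, i)))

def integrate(w0, i, dt, steps):
    w = tuple(w0)
    for _ in range(steps):
        w = rk4_step(w, i, dt)
    return w

if __name__ == "__main__":
    moments = (1.0, 2.0, 3.0)            # i1 < i2 < i3, illustrative values only
    start = (0.1, 1.0, 0.1)              # spin mostly about the middle axis
    print(invariants(start, moments))
    print(invariants(integrate(start, moments, 0.001, 5000), moments))
```

Starting close to the middle axis, as here, the components wander far from their initial values, yet both printed invariants stay fixed to within the integration error.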


3.4.9 The symmetric top 


The full analytic solution for torque-free motion is complicated and requires
elliptic functions. If the body has a single symmetry axis, however, the solution
is quite straightforward. In this case the body has two equal moments of inertia,
$i_1 = i_2$, and the third principal moment $i_3$ is assumed to be different. With this
assignment $e_3$ is the symmetry axis of the body. The action of the inertia tensor
on $\Omega_B$ is

$$\begin{aligned}
\mathcal{I}(\Omega_B) &= i_1\omega_1e_2e_3 + i_1\omega_2e_3e_1 + i_3\omega_3e_1e_2\\
&= i_1\Omega_B + (i_3 - i_1)\omega_3\,Ie_3,
\end{aligned} \tag{3.152}$$

so we can write $\mathcal{I}(\Omega_B)$ in the compact form

$$\mathcal{I}(\Omega_B) = i_1\Omega_B + (i_3 - i_1)(\Omega_B\wedge e_3)e_3. \tag{3.153}$$

(This type of expression offers many advantages over the alternative ‘dyad’
notation.) The torque-free equations of motion are now

$$\mathcal{I}(\dot{\Omega}_B) = \Omega_B\times\mathcal{I}(\Omega_B) = (i_3 - i_1)\,\Omega_B\times\bigl((\Omega_B\wedge e_3)e_3\bigr). \tag{3.154}$$

Since $\Omega_B\wedge e_3$ is a trivector, we can dualise the final term and write

$$\mathcal{I}(\dot{\Omega}_B) = -(i_3 - i_1)\,e_3\wedge\bigl((\Omega_B\wedge e_3)\Omega_B\bigr). \tag{3.155}$$

It follows that

$$e_3\wedge\mathcal{I}(\dot{\Omega}_B) = 0 = i_3\dot{\omega}_3I, \tag{3.156}$$

which shows that $\omega_3$ is a constant. This result can be read off directly from the
Euler equations, but it is useful to see how it can be derived without dropping
down to the individual component equations. The ability to do this becomes
ever more valuable as the complexity of the equations increases.

Next we use the result that

$$\begin{aligned}
i_1\Omega_B &= \mathcal{I}(\Omega_B) - (i_3 - i_1)(\Omega_B\wedge e_3)e_3\\
&= \mathcal{I}(\Omega_B) + (i_1 - i_3)\omega_3\,Ie_3
\end{aligned} \tag{3.157}$$

to write

$$\Omega = R\Omega_BR^\dagger = \frac{1}{i_1}L + \frac{i_1 - i_3}{i_1}\omega_3\,RIe_3R^\dagger. \tag{3.158}$$

Our rotor equation now becomes

$$\dot{R} = -\tfrac{1}{2}\Omega R = -\frac{1}{2i_1}\bigl(LR + R\,(i_1 - i_3)\omega_3\,Ie_3\bigr). \tag{3.159}$$

The right-hand side of this equation involves two constant bivectors, one
multiplying R to the left and the other to the right. We therefore define the two
bivectors

$$\Omega_l = \frac{1}{i_1}L, \qquad \Omega_r = \omega_3\frac{i_1 - i_3}{i_1}\,Ie_3, \tag{3.160}$$

so that the rotor equation becomes

$$\dot{R} = -\tfrac{1}{2}\Omega_lR - \tfrac{1}{2}R\Omega_r. \tag{3.161}$$

This equation integrates immediately to give

$$R(t) = \exp\left(-\tfrac{1}{2}\Omega_lt\right)R_0\exp\left(-\tfrac{1}{2}\Omega_rt\right). \tag{3.162}$$

This fully describes the motion of a symmetric top. It shows that there is an
‘internal’ rotation in the $e_1e_2$ plane (the symmetry plane of the body). This is
responsible for the precession of a symmetric top. The constant rotor $R_0$ defines
the attitude of the rigid body at t = 0 and can be set to 1. The resultant body is
then rotated in the plane of its angular momentum to obtain the final attitude
in space.
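
At the level of components, the internal rotation can be seen directly from the torque-free Euler equations with two equal moments: the third component of the body angular velocity is constant, and the first two rotate into each other at the constant rate ω₃(i₁ − i₃)/i₁. The sketch below writes down this closed form and compares its numerical time derivative with the right-hand side of the Euler equations. The closed-form expression is our own rearrangement of the component equations, offered as a cross-check rather than as the method of the text, and the function names are illustrative.

```python
import math

def body_rates(t, w0, i1, i3):
    """Closed-form body components (w1, w2, w3) for a torque-free top with i1 = i2."""
    w1, w2, w3 = w0
    rate = w3 * (i1 - i3) / i1            # constant rate of the internal rotation
    c, s = math.cos(rate * t), math.sin(rate * t)
    return (w1 * c + w2 * s, -w1 * s + w2 * c, w3)

def euler_rhs_symmetric(w, i1, i3):
    """Torque-free Euler equations with i2 = i1, solved for the time derivatives."""
    w1, w2, w3 = w
    return (w2 * w3 * (i1 - i3) / i1, w3 * w1 * (i3 - i1) / i1, 0.0)

def central_difference(t, w0, i1, i3, h=1e-6):
    """Numerical time derivative of the closed form, for comparison."""
    plus = body_rates(t + h, w0, i1, i3)
    minus = body_rates(t - h, w0, i1, i3)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

if __name__ == "__main__":
    w0, i1, i3, t = (0.3, -0.2, 1.5), 2.0, 1.2, 0.8   # illustrative values only
    print(central_difference(t, w0, i1, i3))
    print(euler_rhs_symmetric(body_rates(t, w0, i1, i3), i1, i3))
```

The two printed triples agree to the accuracy of the finite difference, confirming that the closed form satisfies the component equations.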


3.5 Notes 


Much of this chapter follows New Foundations for Classical Mechanics by David 
Hestenes (1999), which gives a comprehensive account of the applications to clas- 
sical mechanics of geometric algebra in three dimensions. Readers are encouraged 
to compare the techniques used in this chapter with more traditional methods, 
a good description of which can be found in Classical Mechanics by Goldstein 
(1950), or Analytical Mechanics by Hand & Finch (1998). The standard reference 
for the Kustaanheimo-Stiefel equation is Linear and Regular Celestial Mechanics
by Stiefel and Scheifele (1971). Many authors have explored this technique,
particularly in the quaternionic framework. These include Hestenes’ ‘Celestial
mechanics with geometric algebra’ (1983) and the papers by Aramanovitch (1995)
and Vrbik (1994, 1995).


3.6 Exercises 


3.1 An elliptical orbit in an inverse-square force law is parameterised in 
terms of a scalar + pseudoscalar quantity $U$ by $x = U^2 e_1$. Prove that 
$U$ can be written 

\[ U = A_0 e^{I\omega s} + B_0 e^{-I\omega s}, \]

where $dt/ds = r$, $r = |x| = UU^\dagger$ and $I$ is the unit bivector for the 
plane. What is the value of $\omega$? Find the conditions on $A_0$ and $B_0$ such 
that at time t = 0, s = 0 and the particle lies on the positive $e_1$ axis 
with velocity in the positive $e_2$ direction. For which value of s does the 
velocity point in the $-e_1$ direction? Find the values for the shortest and 
longest diameters of the ellipse, and verify that we can write 

\[ U = \sqrt{a(1+e)}\,\cos(\omega s) - \sqrt{a(1-e)}\, I \sin(\omega s), \]

where e is the eccentricity and a is the semi-major axis. 

3.2 For elliptical orbits the semi-major axis a is defined by $a = \tfrac{1}{2}(r_1 + r_2)$, 
where $r_1$ and $r_2$ are the distances of closest and furthest approach. Prove 
that 

\[ \frac{l^2}{k\mu} = a(1 - e^2). \]
Hence show that we can write 
\[ r = \frac{a(1 - e^2)}{1 + e\cos(\theta)}, \]

where $e\cos(\theta) = e \cdot \hat{x}$. The eccentricity vector points to the point of 
closest approach. Why would we expect the orbital average of $\hat{x}/r^4$ to 
also point in this direction? Prove that 

\[ \int_0^T dt\, \frac{\hat{x}}{r^4} = \frac{e}{|e|}\, \frac{T}{2\pi a^4 (1 - e^2)^{5/2}} \int_0^{2\pi} d\theta\, (1 + e\cos(\theta))^2 \cos(\theta) \]

and evaluate the integral. 

3.3 A particle in three dimensions moves along a curve $x(t)$ such that $|v|$ is 
constant. Show that there exists a bivector $\Omega$ such that 

\[ \dot{v} = \Omega \cdot v, \]

and give an explicit formula for $\Omega$. Is this bivector unique? 
3.4 Suppose that we measure components of the position vector x in a 
rotating frame $\{f_i\}$. By referring this frame to a fixed frame, show that 
the components of x are given by 

\[ x_i = e_i \cdot (R^\dagger x R). \]
By differentiating this expression twice, prove that we can write 
\[ f_i \ddot{x}_i = \ddot{x} + \Omega \cdot (\Omega \cdot x) + 2\,\Omega \cdot \dot{x} + \dot{\Omega} \cdot x. \]
Hence deduce expressions for the centrifugal, Coriolis and Euler forces 
in terms of the angular velocity bivector $\Omega$. 

3.5 Show that the inertia tensor satisfies the following properties: 
linearity: $\mathcal{I}(\lambda A + \mu B) = \lambda \mathcal{I}(A) + \mu \mathcal{I}(B)$ 
symmetry: $\langle A\, \mathcal{I}(B) \rangle = \langle \mathcal{I}(A)\, B \rangle$. 

3.6 Prove that the inertia tensor $\mathcal{I}(B)$ for a solid cylinder of height h and 
radius a can be written 

\[ \mathcal{I}(B) = \frac{Mh^2}{12}\,(B - B \wedge e_3\, e_3) + \frac{Ma^2}{4}\,(B + B \wedge e_3\, e_3), \]

where $e_3$ is the symmetry axis. 

3.7 For a torque-free symmetric top prove that the angular momentum, 
viewed back in the reference copy, rotates around the symmetry axis at 
an angular frequency $\omega$, where 

\[ \omega = \omega_3\, \frac{i_3 - i_1}{i_1}. \]
Show that the angle between the symmetry axis and the vector $\mathbf{l} = -IL$ 
is given by 
\[ \cos(\theta) = \frac{i_3 \omega_3}{l}, \]
where $l^2 = \mathbf{l}^2 = LL^\dagger$. Hence show that the symmetry axis rotates in 
space in the L plane at an angular frequency $\omega'$, where 
\[ \omega' = \frac{i_3 \omega_3}{i_1 \cos(\theta)}. \]
4 Foundations of geometric algebra 


In chapter 2 we introduced geometric algebra in two and three dimensions. We 
now turn to a discussion of the full, axiomatic framework for geometric algebra 
in arbitrary dimensions, with arbitrary signature. This will involve some dupli- 
cation of material from chapter 2, but we hope that this will help reinforce some 
of the key concepts. Much of the material in this chapter is of primary relevance 
to those interested in the full range of applications of geometric algebra. Those 
interested solely in applications to space and spacetime may want to skip some 
of the material below, as both of these algebras are treated in a self-contained 
manner in chapters 2 and 5 respectively. The material on frames and linear al- 
gebra is important, however, and a knowledge of this is assumed for applications 
in gravitation. 

The fact that geometric algebra can be applied in spaces of arbitrary dimen- 
sions is crucial to the claim that it is a mathematical tool of universal applica- 
bility. The framework developed here will enable us to extend geometric algebra 
to the study of relativistic dynamics, phase space, single and multiparticle quan- 
tum theory, Lie groups and manifolds. This chapter also highlights some of the 
new algebraic techniques we now have at our disposal. Many derivations can be 
simplified through judicious use of the geometric product at various intermediate 
steps. This is true even if the initial and final expressions contain only inner and 
outer products. 

Many key relations in physics involve linear mappings between one space and 
another. In this chapter we also explore how geometric algebra simplifies the 
rich subject of linear transformations. We start with simple mappings between 
vectors in the same space and study their properties in a very general, basis-free 
framework. In later chapters this framework is extended to encompass functions 
between different spaces, and multilinear functions where the argument of the 
function can consist of one or more multivectors. 
4.1 Axiomatic development 


We should now have an intuitive feel for the elements of a geometric algebra 
— the multivectors — and some of their multiplicative properties. The next 
step is to define a set of axioms and conventions which enable us to efficiently 
manipulate them. Geometric algebra can be defined using a number of axiomatic 
frameworks, all of which give rise to the same final algebra. In the main we 
will follow the approach first developed by Hestenes and Sobczyk and raise the 
geometric product to primary status in the algebra. The properties of the inner 
and outer products are then inherited from the full geometric product, and this 
simplifies proofs of a number of important results. 

Our starting point is the vector space from which the entire algebra will be 
generated. Vectors (i.e. grade-1 multivectors) have a special status in the algebra, 
as the grading of the algebra is determined by them. Three main axioms govern 
the properties of the geometric product for vectors. 


(i) The geometric product is associative: 
\[ a(bc) = (ab)c = abc. \tag{4.1} \]
(ii) The geometric product is distributive over addition: 
\[ a(b + c) = ab + ac. \tag{4.2} \]
(iii) The square of any vector is a real scalar: $a^2 \in \mathbb{R}$. 


The final axiom is the key one which distinguishes a geometric algebra from a 
general associative algebra. We do not force the scalar to be positive, so we can 
incorporate Minkowski spacetime without modification of our axioms. Nothing 
is assumed about the commutation properties of the geometric product — matrix 
multiplication is one picture to keep in mind. Indeed, one can always represent 
the geometric product in terms of products of suitably chosen matrices, but this 
does not bring any new insights into the properties of the geometric product. 

By successively multiplying together vectors we generate the complete algebra. 
Elements of this algebra are called multivectors and are usually written in 
upper-case italic font. The space of multivectors is linear over the real numbers, 
so if $\lambda$ and $\mu$ are scalars and A and B are multivectors, $\lambda A + \mu B$ is also a multivector. 
We only consider the algebra over the reals as most occurrences of complex 
numbers in physics turn out to have a geometric origin. This geometric meaning 
can be lost if we admit a scalar unit imaginary. Any multivector can be written 
as a sum of geometric products of vectors. They too can be multiplied using the 
geometric product and this product inherits properties (i) and (ii) above. So, 
for multivectors A, B and C, we have 


(AB)C = A(BC) = ABC (4.3) 
and 

\[ A(B + C) = AB + AC. \tag{4.4} \]
If we now form the square of the vector a + b we find that 
\[ (a + b)^2 = (a + b)(a + b) = a^2 + ab + ba + b^2. \tag{4.5} \]
It follows that the symmetrised product of two vectors can be written 
\[ ab + ba = (a + b)^2 - a^2 - b^2, \tag{4.6} \]

and so must also be a scalar, by axiom (iii). We therefore define the inner 
product for vectors by 

\[ a \cdot b = \tfrac{1}{2}(ab + ba). \tag{4.7} \]

The remaining, antisymmetric part of the geometric product is defined as the 
exterior product and returns a bivector, 

\[ a \wedge b = \tfrac{1}{2}(ab - ba). \tag{4.8} \]
These definitions combine to give the familiar result 
\[ ab = a \cdot b + a \wedge b. \tag{4.9} \]


In forming this decomposition we have defined both the inner and outer products 
of vectors in terms of the geometric product. This contrasts with the common 
alternative of defining the geometric product in terms of separate inner and 
outer products. Some authors prefer this alternative because the (less famil- 
iar) geometric product is defined in terms of more familiar objects. The main 
drawback, however, is that work still remains to establish the main properties of 
the geometric product. In particular, it is far from obvious that the product is 
associative, which is invaluable for its use. 
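
The decomposition (4.9) can be checked numerically. The sketch below is only an illustration and not part of the axiomatic development: it assumes a Euclidean orthonormal frame (every basis vector squares to +1), stores a multivector as a dictionary from basis-blade bitmaps to coefficients, and the helper names `reorder_sign`, `gp` and `half_sum` are hypothetical.

```python
def reorder_sign(a, b):
    """Sign from multiplying basis blades a and b (bitmaps; bit i <-> e_{i+1}).

    Counts the swaps of anticommuting basis vectors needed to bring the
    product into canonical order. Assumes every basis vector squares to +1.
    """
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmap: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + reorder_sign(a, b) * x * y
    return out


def half_sum(A, B, sign):
    """Return (A + sign * B) / 2 with zero coefficients dropped."""
    out = {k: 0.5 * (A.get(k, 0.0) + sign * B.get(k, 0.0)) for k in set(A) | set(B)}
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


a = {1: 1.0, 2: 2.0, 4: 3.0}     # a = e1 + 2 e2 + 3 e3
b = {1: 4.0, 2: -1.0, 4: 2.0}    # b = 4 e1 - e2 + 2 e3
ab, ba = gp(a, b), gp(b, a)

inner = half_sum(ab, ba, +1)     # equation (4.7): only the scalar key 0 survives
outer = half_sum(ab, ba, -1)     # equation (4.8): only bivector keys survive

print(inner)
print(outer)
```

Here the symmetric part contains only the scalar key 0, as axiom (iii) requires, and the antisymmetric part contains only the bivector keys 3, 5 and 6 (the blades e1e2, e1e3 and e2e3).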


4.1.1 The outer product, grading and bases 


In the preceding we defined the outer product of two vectors and asserted that 
this returns a bivector (a grade-2 multivector). This is the key to defining the 
grade operation for the entire algebra. To do this we first extend the definition of 
the outer product to arbitrary numbers of vectors. The outer (exterior) product 
of the vectors $a_1, \ldots, a_r$ is denoted by $a_1 \wedge a_2 \wedge \cdots \wedge a_r$ and is defined as the 
totally antisymmetrised sum of all geometric products: 

\[ a_1 \wedge a_2 \wedge \cdots \wedge a_r = \frac{1}{r!} \sum (-1)^{\epsilon}\, a_{k_1} a_{k_2} \cdots a_{k_r}. \tag{4.10} \]

The sum runs over every permutation $k_1, \ldots, k_r$ of $1, \ldots, r$, and $(-1)^{\epsilon}$ is 
+1 or −1 as the permutation $k_1, \ldots, k_r$ is even or odd respectively. So, for 
example, 

\[ a_1 \wedge a_2 = \frac{1}{2!}(a_1 a_2 - a_2 a_1) \tag{4.11} \]


as required. 

The antisymmetry of the outer product ensures that it vanishes if any two 
vectors are the same. It follows that the outer product vanishes if the vectors 
are linearly dependent, since in this case one vector can be written as a linear 
combination of the remaining vectors. The outer product therefore records the 
dimensionality of the object formed from a set of vectors. This is precisely what 
we mean by grade, so we define the outer product of r vectors as having grade r. 
Any multivector which can be written purely as the outer product of a set of 
vectors is called a blade. Any multivector can be expressed as a sum of blades, 
as can be verified by introducing an explicit basis. These blades all have definite 
grade and in turn define the grade or grades of the multivector. 

We rarely need the full antisymmetrised expression when studying blades. In- 
stead we can employ the result that every blade can be written as a geometric 
product of orthogonal, anticommuting vectors. The anticommutation of orthog- 
onal vectors then takes care of the antisymmetry of the product. In Euclidean 
space this result is simple to prove using a form of Gram-Schmidt orthogonali- 
sation. Given two vectors a and b we form 


\[ b' = b - \lambda a. \tag{4.12} \]
We then see that 
\[ a \wedge (b - \lambda a) = a \wedge b - \lambda\, a \wedge a = a \wedge b. \tag{4.13} \]

So the same bivector is obtained, whatever the value of $\lambda$ (figure 4.1). The 
bivector encodes an oriented plane with magnitude determined by the area. 
Interchanging $b$ and $b'$ changes neither the orientation nor the magnitude, so 
returns the same bivector. We now form 

\[ a \cdot b' = a \cdot (b - \lambda a) = a \cdot b - \lambda a^2. \tag{4.14} \]
So if we set $\lambda = a \cdot b / a^2$ we have $a \cdot b' = 0$ and can write 
\[ a \wedge b = a \wedge b' = ab'. \tag{4.15} \]


One can continue in this manner and construct a complete set of orthogonal 
vectors generating the same outer product. 

Figure 4.1 The Gram-Schmidt process. The outer product $a \wedge b$ is 
independent of shape of the parallelogram formed by a and b. The only 
information contained in $a \wedge b$ is the oriented plane and a magnitude. The 
vectors $b$ and $b'$ generate the same bivector, so we can choose $b'$ orthogonal 
to a and write $a \wedge b = ab'$. 


An alternative form for b’ is quite revealing. We write 
\[ b' = b - a^{-1}\, a \cdot b = a^{-1}(ab - a \cdot b) = a^{-1}(a \wedge b). \tag{4.16} \]
This shows that $b'$ is formed by rotating a through 90° in the $a \wedge b$ plane, and 
dilating by the appropriate amount. The algebraic form also makes it clear why 
$ab' = a \wedge b$, and gives a formula that extends simply to higher grades. 
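
As a concrete check of equations (4.12) to (4.16), take two illustrative vectors in an orthonormal Euclidean frame $\{e_1, e_2\}$. The numbers are chosen here for convenience and do not come from the text.

```latex
\begin{align*}
a &= 2e_1, \qquad b = e_1 + 3e_2, \qquad
\lambda = \frac{a \cdot b}{a^2} = \frac{2}{4} = \frac{1}{2}, \\
b' &= b - \lambda a = 3e_2, \qquad a \cdot b' = 0, \\
a \wedge b &= 2e_1 \wedge (e_1 + 3e_2) = 6\, e_1 e_2 = ab', \\
a^{-1}(a \wedge b) &= \tfrac{1}{2} e_1\, (6\, e_1 e_2) = 3e_2 = b'.
\end{align*}
```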

The above argument is fine for Euclidean space, but breaks down for spaces of 
mixed signature. The inverse $a^{-1} = a/a^2$ is not defined when a is null ($a^2 = 0$), 
so an alternative procedure is required. Fortunately this is a relatively 
straightforward exercise. We start with the set of r independent vectors $a_1, \ldots, a_r$ and 
form the r × r symmetric matrix 

\[ M_{ij} = a_i \cdot a_j. \tag{4.17} \]

The symmetry of this matrix ensures that it can always be diagonalised with an 
orthogonal matrix $R_{ij}$, 

\[ R_{ik} M_{kl} R_{jl} = R_{ik} R_{jl} M_{kl} = \Lambda_{ij}. \tag{4.18} \]

Here $\Lambda_{ij}$ is diagonal and, unless stated otherwise, the summation convention is 
employed. The matrix $R_{ij}$ defines a new set of vectors via 

\[ e_i = R_{ij} a_j. \tag{4.19} \]
These satisfy 

\[ e_i \cdot e_j = (R_{ik} a_k) \cdot (R_{jl} a_l) = R_{ik} R_{jl} M_{kl} = \Lambda_{ij}. \tag{4.20} \]
The vectors $e_1, \ldots, e_r$ are therefore orthogonal and hence all anticommute. Their 
geometric product is therefore totally antisymmetric, and we have 

\begin{align*}
e_1 e_2 \cdots e_r &= e_1 \wedge \cdots \wedge e_r \\
&= (R_{1i} a_i) \wedge \cdots \wedge (R_{rk} a_k) \\
&= \det(R_{ij})\, a_1 \wedge a_2 \wedge \cdots \wedge a_r. \tag{4.21}
\end{align*}

The determinant appears here because of the total antisymmetry of the 
expression (see section 4.5.2). But since $R_{ij}$ is an orthogonal matrix it has determinant 
±1, and by choosing the order of the $\{e_i\}$ vectors appropriately we can set the 
determinant of $R_{ij}$ to 1. This ensures that we can always find a set of vectors 
such that 

\[ a_1 \wedge a_2 \wedge \cdots \wedge a_r = e_1 e_2 \cdots e_r. \tag{4.22} \]


This result will simplify the proofs of a number of results in this chapter. 
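
A small example shows why the matrix route is needed and where the ordering choice enters. It assumes a two-dimensional space with orthonormal vectors $\gamma_0, \gamma_1$ satisfying $\gamma_0^2 = 1$ and $\gamma_1^2 = -1$ (notation introduced here only for illustration), and takes two null vectors $a_1$, $a_2$, for which Gram-Schmidt with $a^{-1}$ is unavailable.

```latex
\begin{align*}
a_1 &= \gamma_0 + \gamma_1, \quad a_2 = \gamma_0 - \gamma_1, \quad
M = \begin{pmatrix} a_1^2 & a_1 \cdot a_2 \\ a_2 \cdot a_1 & a_2^2 \end{pmatrix}
  = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}, \\
R &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad
\det R = 1, \quad
R M R^{\mathsf{T}} = \begin{pmatrix} -2 & 0 \\ 0 & 2 \end{pmatrix}, \\
e_1 &= \tfrac{1}{\sqrt{2}}(a_1 - a_2) = \sqrt{2}\,\gamma_1, \quad
e_2 = \tfrac{1}{\sqrt{2}}(a_1 + a_2) = \sqrt{2}\,\gamma_0, \\
e_1 e_2 &= 2\gamma_1\gamma_0 = -2\gamma_0\gamma_1 = a_1 \wedge a_2.
\end{align*}
```

Listing the two new vectors in the opposite order would correspond to $\det R = -1$ and would flip the sign of the product, which is the ordering freedom used to arrive at equation (4.22).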

For a given vector space, an orthonormal frame $\{e_i\}$, $i = 1, \ldots, n$ provides a 
natural way to view the entire geometric algebra. We denote this algebra $\mathcal{G}_n$. 
Most of the results derived in this chapter are independent of signature, so in the 
following we let $\mathcal{G}_n$ denote the geometric algebra of a space of dimension n with 
arbitrary (non-degenerate) signature. One can also consider the degenerate case 
where some of the basis vectors are null, though we will not need such algebras 
in this book. The basis vectors build up to form a basis for the entire algebra as 

\[ 1, \quad e_i, \quad e_i e_j \ (i < j), \quad e_i e_j e_k \ (i < j < k), \quad \ldots \tag{4.23} \]
The fact that the basis vectors anticommute ensures that each product in the 
basis set is totally antisymmetric. The product of r distinct basis vectors is 
then, by definition, a grade-r multivector. The basis (4.23) therefore naturally 
defines a basis for each of the grade-r subspaces of $\mathcal{G}_n$. We denote each of these 
subspaces by $\mathcal{G}_n^r$. The size of each subspace is given by the number of distinct 
combinations of r objects from a set of n. (The order is irrelevant, because of 
the total antisymmetry.) These are given by the binomial coefficients, so 

\[ \dim(\mathcal{G}_n^r) = \binom{n}{r}. \tag{4.24} \]
For example, we have already seen that in two dimensions the algebra contains 
terms of grade 0, 1,2 with each space having dimension 1,2,1 respectively. Simi- 
larly in three dimensions the separate graded subspaces have dimension 1,3,3,1. 
The binomial coefficients always exhibit a mirror symmetry between the r and 
n — r terms. This gives rise to the notion of duality, which is explained in sec- 
tion 4.1.4 where we explore the properties of the highest grade element of the 
algebra — the pseudoscalar. 


The total dimension of the algebra is 
\[ \dim(\mathcal{G}_n) = \sum_{r=0}^{n} \dim(\mathcal{G}_n^r) = \sum_{r=0}^{n} \binom{n}{r} = (1 + 1)^n = 2^n. \tag{4.25} \]

One can see that the total size of the algebra quickly becomes very large. If 
one wanted to find a matrix representation of the algebra, the matrices would 
have to be of the order of $2^{n/2} \times 2^{n/2}$. For all but the lowest values of n these 
matrices become totally impractical for computations. This is one reason why 
matrix representations do not help much with understanding and using geometric 
algebra. 
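
The counting in equations (4.24) and (4.25) is easy to tabulate. The following is a minimal sketch using only the standard library; the function name `grade_dimensions` is a hypothetical choice made for this illustration.

```python
from math import comb


def grade_dimensions(n):
    """Dimensions of the grade-r subspaces of G_n for r = 0, ..., n (eq. 4.24)."""
    return [comb(n, r) for r in range(n + 1)]


for n in (2, 3, 4, 5):
    dims = grade_dimensions(n)
    # The total is 2**n (eq. 4.25); the list reads the same in both directions,
    # which is the mirror symmetry between grades r and n - r.
    print(n, dims, sum(dims))
```

For n = 2 and n = 3 this reproduces the dimensions 1,2,1 and 1,3,3,1 quoted above.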

We have now defined the grade operation for our linear space $\mathcal{G}_n$. An arbitrary 
multivector A can be decomposed into a sum of pure grade terms 

\[ A = \langle A \rangle_0 + \langle A \rangle_1 + \cdots = \sum_r \langle A \rangle_r. \tag{4.26} \]
The operator $\langle\ \rangle_r$ projects onto the grade-r terms in the argument, so $\langle A \rangle_r$ 
returns the grade-r components in A. Multivectors containing terms of only one 
grade are called homogeneous. They are often written as $A_r$, so 

\[ \langle A_r \rangle_r = A_r. \tag{4.27} \]

Take care not to confuse the grading subscript in $A_r$ with frame indices in 
expressions like $\{e_k\}$. The context should always make clear which is intended. The 
grade-0 terms in $\mathcal{G}_n$ are the real scalars and commute with all other elements. 
We continue to employ the useful abbreviation 

\[ \langle A \rangle = \langle A \rangle_0 \tag{4.28} \]


for the operation of taking the scalar part. 

An important feature of a geometric algebra is that not all homogeneous mul- 
tivectors are pure blades. This is confusing at first, because we have to go to four 
dimensions before we reach our first counterexample. Suppose that $\{e_1, \ldots, e_4\}$ 
form an orthonormal basis for the Euclidean algebra $\mathcal{G}_4$. There are six 
independent basis bivectors in this algebra, and from these we can construct terms 
like 

\[ B = \alpha\, e_1 \wedge e_2 + \beta\, e_3 \wedge e_4, \tag{4.29} \]

where $\alpha$ and $\beta$ are scalars. B is a pure bivector, so is homogeneous, but it cannot 
be reduced to a blade. That is, we cannot find two vectors a and b such that 
$B = a \wedge b$. The reason is that $e_1 \wedge e_2$ and $e_3 \wedge e_4$ do not share a common vector. 
This is not possible in three dimensions, because any two planes with a common 
origin share a common line. A four-dimensional bivector like B is therefore hard 
for us to visualise. There is a way to visualise B in three dimensions, however, 
and it is provided by projective geometry. This is described in chapter 10. 
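
One way to confirm that B in equation (4.29) is not a blade, offered here as a side check rather than as the argument of the text, is to compare squares. In Euclidean space any 2-blade can be written as $ab'$ with $a \cdot b' = 0$ by equation (4.15), so its square is a scalar. The square of B is not.

```latex
\begin{align*}
(a \wedge b)^2 &= ab'ab' = -a^2 b'^2, \\
B^2 &= (\alpha\, e_1 e_2 + \beta\, e_3 e_4)^2
     = -(\alpha^2 + \beta^2) + 2\alpha\beta\, e_1 e_2 e_3 e_4 .
\end{align*}
```

The grade-4 term vanishes only if $\alpha\beta = 0$, so B cannot be reduced to a blade when both coefficients are non-zero.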
4.1.2 Further properties of the geometric product 


The decomposition of the geometric product of two vectors into a scalar term 
and a bivector term has a natural extension to general multivectors. To establish 
the results of this section we make repeated use of the formula 

\[ ab = 2a \cdot b - ba \tag{4.30} \]

which we use to reorder expressions. As a first example, consider the case of a 
geometric product of vectors. We find that 

\begin{align*}
a a_1 a_2 \cdots a_r &= 2a \cdot a_1\, a_2 \cdots a_r - a_1 a a_2 \cdots a_r \\
&= 2a \cdot a_1\, a_2 \cdots a_r - 2a \cdot a_2\, a_1 a_3 \cdots a_r + a_1 a_2 a a_3 \cdots a_r \\
&= 2 \sum_{k=1}^{r} (-1)^{k+1}\, a \cdot a_k\, a_1 a_2 \cdots \check{a}_k \cdots a_r + (-1)^r a_1 a_2 \cdots a_r a, \tag{4.31}
\end{align*}

where the check on $\check{a}_k$ denotes that this term is missing from the series. We 
continue to follow the conventions introduced in chapter 2 so, in the absence 
of brackets, inner products are performed before outer products, and both are 
performed before geometric products. 

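Suppose now that the vectors $a_1, \ldots, a_r$ are replaced by a set of anticommuting 
vectors $e_1, \ldots, e_r$. We find that 

\[ \tfrac{1}{2}\bigl(a e_1 e_2 \cdots e_r - (-1)^r e_1 e_2 \cdots e_r a\bigr) = \sum_{k=1}^{r} (-1)^{k+1}\, a \cdot e_k\, e_1 e_2 \cdots \check{e}_k \cdots e_r. \tag{4.32} \]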

The right-hand side contains a sum of terms formed from the product of r — 1 
anticommuting vectors, so has grade r—1. Since any grade-r multivector can be 
written as a sum of terms formed from anticommuting vectors, the combination 
on the left-hand side will always return a multivector of grade r—1. We therefore 
define the inner product between a vector a and a grade-r multivector $A_r$ by 

\[ a \cdot A_r = \tfrac{1}{2}\bigl(a A_r - (-1)^r A_r a\bigr). \tag{4.33} \]


The inner product of a vector and a grade-r multivector results in a multivector 
with grade reduced by one. 

The main work of this section is in establishing the properties of the remaining 
part of the product $aA_r$. For the case where $A_r$ is a vector, the remaining term 
is the antisymmetric product, and so is a bivector. This turns out to be true 
in general: the remaining part of the geometric product returns the exterior 
product, 

\[ \tfrac{1}{2}\bigl(a (a_1 \wedge a_2 \wedge \cdots \wedge a_r) + (-1)^r (a_1 \wedge a_2 \wedge \cdots \wedge a_r) a\bigr) = a \wedge a_1 \wedge a_2 \wedge \cdots \wedge a_r. \tag{4.34} \]
We will prove this important result by induction. First, we write the blade as a 
geometric product of anticommuting vectors, so that the result we will establish 
becomes 

\[ \tfrac{1}{2}\bigl(a e_1 e_2 \cdots e_r + (-1)^r e_1 e_2 \cdots e_r a\bigr) = a \wedge e_1 \wedge e_2 \wedge \cdots \wedge e_r. \tag{4.35} \]

For r = 1 the result is true as the right-hand side defines the bivector $a \wedge e_1$. For 
r > 1 we proceed by writing 

\[ a \wedge e_1 \wedge e_2 \wedge \cdots \wedge e_r = \frac{1}{r+1}\, a e_1 e_2 \cdots e_r + \frac{1}{r+1} \sum_{k=1}^{r} (-1)^k e_k \bigl(a \wedge e_1 \wedge \cdots \wedge \check{e}_k \wedge \cdots \wedge e_r\bigr). \tag{4.36} \]


This result is easily established by writing out all terms in the full antisymmetric 
product and gathering together the terms which start with the same vector. Next 
we assume that equation (4.35) holds for the case of an r — 1 blade, and expand 
the term inside the sum as follows: 


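\begin{align*}
\sum_{k=1}^{r} (-1)^k e_k \bigl(a \wedge e_1 \wedge \cdots \wedge \check{e}_k \wedge \cdots \wedge e_r\bigr)
&= \frac{1}{2} \sum_{k=1}^{r} (-1)^k e_k \bigl(a e_1 \cdots \check{e}_k \cdots e_r + (-1)^{r-1} e_1 \cdots \check{e}_k \cdots e_r a\bigr) \\
&= \sum_{k=1}^{r} (-1)^k\, a \cdot e_k\, e_1 \cdots \check{e}_k \cdots e_r + \frac{r}{2}\bigl(a e_1 \cdots e_r + (-1)^r e_1 \cdots e_r a\bigr) \\
&= \frac{r-1}{2}\, a e_1 \cdots e_r + \frac{r+1}{2}\, (-1)^r e_1 \cdots e_r a, \tag{4.37}
\end{align*}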


where we have used equation (4.32). Substituting this result into equation (4.36) 
then proves equation (4.35) for a grade-r blade, assuming it is true for a blade 
of grade r — 1. Since the result is already established for r = 1, equation (4.34) 
holds for all blades and hence all multivectors. 

We extend the definition of the wedge symbol by writing 

\[ a \wedge A_r = \tfrac{1}{2}\bigl(a A_r + (-1)^r A_r a\bigr). \tag{4.38} \]
With this definition we now have 
\[ a A_r = a \cdot A_r + a \wedge A_r, \tag{4.39} \]

which extends the decomposition of the geometric product in precisely the de- 
sired way. In equation (4.38) one can see how the geometric product can simplify 
many calculations. The left-hand side would, in general, require totally antisym- 
metrising all possible products. But the right-hand side only requires evaluating 
two products, an enormous saving! As we have established the grades of the 
separate inner and outer products, we also have 

\[ a A_r = \langle a A_r \rangle_{r-1} + \langle a A_r \rangle_{r+1}, \tag{4.40} \]
where 
\[ a \cdot A_r = \langle a A_r \rangle_{r-1}, \qquad a \wedge A_r = \langle a A_r \rangle_{r+1}. \tag{4.41} \]


So, as expected, multiplication by a vector raises and lowers the grade of a 
multivector by 1. 
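
A short worked case may help fix the signs in equations (4.33) and (4.38). It assumes an orthonormal Euclidean frame $\{e_1, e_2, e_3\}$ and an illustrative bivector $A_2$ chosen here, so r = 2.

```latex
\begin{align*}
a &= e_1, \qquad A_2 = e_1 e_2 + e_2 e_3, \\
a A_2 &= e_2 + e_1 e_2 e_3, \qquad A_2 a = -e_2 + e_1 e_2 e_3, \\
a \cdot A_2 &= \tfrac{1}{2}(a A_2 - A_2 a) = e_2 = \langle a A_2 \rangle_1, \\
a \wedge A_2 &= \tfrac{1}{2}(a A_2 + A_2 a) = e_1 e_2 e_3 = \langle a A_2 \rangle_3 .
\end{align*}
```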

A homogeneous multivector can be written as a sum of blades, and each blade 
can be written as a geometric product of anticommuting vectors. Applying the 
preceding decomposition, we establish that the product of two homogeneous 
multivectors decomposes as 


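\[ A_r B_s = \langle A_r B_s \rangle_{|r-s|} + \langle A_r B_s \rangle_{|r-s|+2} + \cdots + \langle A_r B_s \rangle_{r+s}. \tag{4.42} \]

We retain the $\cdot$ and $\wedge$ symbols for the lowest and highest grade terms in this 
series: 
\begin{align*}
A_r \cdot B_s &= \langle A_r B_s \rangle_{|r-s|}, \\
A_r \wedge B_s &= \langle A_r B_s \rangle_{r+s}. \tag{4.43}
\end{align*}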


This is the most general use of the wedge symbol, and is consistent with the 
earlier definition as the antisymmetrised product of a set of vectors. We can 
check that the outer product is associative by forming 


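\[ (A_r \wedge B_s) \wedge C_t = \langle A_r B_s \rangle_{r+s} \wedge C_t = \langle (A_r B_s) C_t \rangle_{r+s+t}. \tag{4.44} \]

Associativity of the outer product then follows from the fact that the geometric 
product is associative: 

\[ \langle (A_r B_s) C_t \rangle_{r+s+t} = \langle A_r B_s C_t \rangle_{r+s+t} = A_r \wedge B_s \wedge C_t. \tag{4.45} \]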


In equation (4.32) we established a formula for the result for the inner product 
of a vector and a blade formed from orthogonal vectors. We now extend this to 
a more general result that is extremely useful in practice. We start by writing 


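\[ a \cdot (a_1 \wedge a_2 \wedge \cdots \wedge a_r) = a \cdot \langle a_1 a_2 \cdots a_r \rangle_r, \tag{4.46} \]
where $a_1, \ldots, a_r$ are a general set of vectors. The geometric product $a_1 a_2 \cdots a_r$ 
can only contain terms of grade r, r − 2, ..., so 

\[ \tfrac{1}{2}\bigl(a a_1 a_2 \cdots a_r - (-1)^r a_1 a_2 \cdots a_r a\bigr) = a \cdot \langle a_1 a_2 \cdots a_r \rangle_r + a \cdot \langle a_1 a_2 \cdots a_r \rangle_{r-2} + \cdots. \tag{4.47} \]

The term we are after is the r − 1 grade part, so we have 

\[ a \cdot (a_1 \wedge a_2 \wedge \cdots \wedge a_r) = \tfrac{1}{2} \bigl\langle a a_1 a_2 \cdots a_r - (-1)^r a_1 a_2 \cdots a_r a \bigr\rangle_{r-1}. \tag{4.48} \]
We can now apply equation (4.31) inside the grade projection operator to form 

\begin{align*}
a \cdot (a_1 \wedge a_2 \wedge \cdots \wedge a_r)
&= \sum_{k=1}^{r} (-1)^{k+1}\, a \cdot a_k\, \langle a_1 \cdots \check{a}_k \cdots a_r \rangle_{r-1} \\
&= \sum_{k=1}^{r} (-1)^{k+1}\, a \cdot a_k\, a_1 \wedge \cdots \wedge \check{a}_k \wedge \cdots \wedge a_r. \tag{4.49}
\end{align*}

The first two cases illustrate how the general formula behaves: 

\begin{align*}
a \cdot (a_1 \wedge a_2) &= a \cdot a_1\, a_2 - a \cdot a_2\, a_1, \\
a \cdot (a_1 \wedge a_2 \wedge a_3) &= a \cdot a_1\, a_2 \wedge a_3 - a \cdot a_2\, a_1 \wedge a_3 + a \cdot a_3\, a_1 \wedge a_2. \tag{4.50}
\end{align*}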


The first case was established in chapter 2, where it was used to replace the 
formula for the double cross product of vectors in three dimensions. 
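
The first case of equation (4.50) can be verified numerically from the definitions alone. The sketch below is illustrative only: it assumes a three-dimensional Euclidean frame, represents multivectors as dictionaries keyed by basis-blade bitmaps, and uses hypothetical helper names (`reorder_sign`, `gp`, `half_sum`, `vector`) with arbitrarily chosen test vectors.

```python
def reorder_sign(a, b):
    """Sign from multiplying basis blades a and b (bitmaps; bit i <-> e_{i+1}).
    Assumes a Euclidean frame, so every basis vector squares to +1."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmap: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + reorder_sign(a, b) * x * y
    return out


def half_sum(A, B, sign):
    """Return (A + sign * B) / 2 with zero coefficients dropped."""
    out = {k: 0.5 * (A.get(k, 0.0) + sign * B.get(k, 0.0)) for k in set(A) | set(B)}
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


def vector(*coords):
    """Vector with the given components along e1, e2, ..."""
    return {1 << i: float(c) for i, c in enumerate(coords)}


a, a1, a2 = vector(1, 0, 2), vector(2, 1, 0), vector(0, 3, 1)

blade = half_sum(gp(a1, a2), gp(a2, a1), -1)      # a1 ^ a2, equation (4.8)
lhs = half_sum(gp(a, blade), gp(blade, a), -1)    # a . (a1 ^ a2), equation (4.33), r = 2

dot1 = half_sum(gp(a, a1), gp(a1, a), +1).get(0, 0.0)   # a . a1
dot2 = half_sum(gp(a, a2), gp(a2, a), +1).get(0, 0.0)   # a . a2
rhs = {k: dot1 * a2[k] - dot2 * a1[k] for k in a2}      # (a.a1) a2 - (a.a2) a1
rhs = {k: v for k, v in rhs.items() if abs(v) > 1e-12}

print(lhs == rhs)
```

Both sides evaluate to the vector −4e1 + 4e2 + 2e3 for these inputs.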


4.1.3 The reverse, the scalar and the commutator product 


Now that the grading is established, we can establish some general properties of 
the reversion operator, which was first introduced in chapter 2. The reverse of a 
product of vectors is defined by 


\[ (ab \cdots c)^\dagger = c \cdots ba. \tag{4.51} \]

For a blade the reverse can be formed by a series of swaps of anticommuting 
vectors, each resulting in a minus sign. The first vector has to swap past r − 1 
vectors, the second past r − 2, and so on. This demonstrates that 

\[ A_r^\dagger = (-1)^{r(r-1)/2} A_r. \tag{4.52} \]

If we now consider the scalar part of a geometric product of two grade-r 
multivectors we find that 

\[ \langle A_r B_r \rangle = \langle A_r B_r \rangle^\dagger = \langle B_r^\dagger A_r^\dagger \rangle = (-1)^{r(r-1)} \langle B_r A_r \rangle = \langle B_r A_r \rangle, \tag{4.53} \]
so, for general A and B, 
\[ \langle AB \rangle = \langle BA \rangle. \tag{4.54} \]
It follows that 
\[ \langle A \cdots BC \rangle = \langle CA \cdots B \rangle. \tag{4.55} \]

This cyclic reordering property is frequently useful for manipulating expressions. 
The product in equation (4.54) is sometimes given the symbol ∗, so we write 

\[ A \ast B = \langle AB \rangle. \tag{4.56} \]
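
The sign in equation (4.52) follows a period-four pattern in the grade. A minimal sketch tabulating it is given below; the function names are hypothetical, and the swap count simply mirrors the argument in the text (the first vector moves past r − 1 others, the second past r − 2, and so on).

```python
def reverse_sign(r):
    """Sign (-1)^(r(r-1)/2) picked up by a grade-r blade under reversion (eq. 4.52)."""
    return (-1) ** (r * (r - 1) // 2)


def swaps_to_reverse(r):
    """Adjacent swaps needed to reverse the order of r anticommuting vectors."""
    return sum(r - 1 - i for i in range(r))


signs = [reverse_sign(r) for r in range(8)]
print(signs)   # [1, 1, -1, -1, 1, 1, -1, -1]
```

Scalars and vectors are unchanged by reversion, bivectors and trivectors change sign, and the pattern then repeats.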
A further product of considerable importance in geometric algebra is the 
commutator product of two multivectors. This is denoted with a cross, ×, and is 
defined by 

\[ A \times B = \tfrac{1}{2}(AB - BA). \tag{4.57} \]

Care must be taken to include the factor of one-half, which is different to the 
standard commutator of two operators in quantum mechanics. The commutator 
product satisfies the Jacobi identity 

\[ A \times (B \times C) + B \times (C \times A) + C \times (A \times B) = 0, \tag{4.58} \]


which is easily seen by expanding out the products. 
The commutator arises most frequently in equations involving bivectors. Given 
a bivector B and a vector a we have 


\[ B \times a = \tfrac{1}{2}(Ba - aB) = B \cdot a, \tag{4.59} \]

which therefore results in a second vector. Now consider the product of a bivector 
and a blade formed from anticommuting vectors. We have 

\begin{align*}
B (e_1 e_2 \cdots e_r) &= 2 (B \times e_1) e_2 \cdots e_r + e_1 B e_2 \cdots e_r \\
&= 2 (B \times e_1) e_2 \cdots e_r + \cdots + 2 e_1 \cdots (B \times e_r) + e_1 e_2 \cdots e_r B. \tag{4.60}
\end{align*}
It follows that 
\[ B \times (e_1 e_2 \cdots e_r) = \sum_{i=1}^{r} e_1 \cdots (B \times e_i) \cdots e_r. \tag{4.61} \]

The sum involves a series of terms which can only contain grades r and r − 2. 
But if we form the reverse of the commutator product between a bivector and a 
homogeneous multivector, we find that 

\begin{align*}
(B \times A_r)^\dagger &= \tfrac{1}{2}(B A_r - A_r B)^\dagger \\
&= \tfrac{1}{2}(-A_r^\dagger B + B A_r^\dagger) \\
&= (-1)^{r(r-1)/2}\, B \times A_r. \tag{4.62}
\end{align*}

It follows that $B \times A_r$ has the same properties under reversion as $A_r$. But 
multivectors of grade r and r − 2 always behave differently under reversion. 
The commutator product in equation (4.61) must therefore result in a grade-r 
multivector. Since this is true of any grade-r basis element, it must be true of 
any homogeneous multivector. That is, 

\[ B \times A_r = \langle B \times A_r \rangle_r. \tag{4.63} \]


The commutator of a multivector with a bivector therefore preserves the grade 
of the multivector. Furthermore, the commutator of two bivectors must result 
in a third bivector. This is the basis for incorporating the theory of Lie groups 
into geometric algebra. 
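
The closure of bivectors under the commutator product can be seen explicitly in three Euclidean dimensions. The labels $B_1, B_2, B_3$ below are introduced only for this illustration and name the three basis bivectors of an orthonormal frame.

```latex
\begin{align*}
B_1 &= e_2 e_3, \qquad B_2 = e_3 e_1, \qquad B_3 = e_1 e_2, \\
B_3 \times B_1 &= \tfrac{1}{2}(e_1 e_2 e_2 e_3 - e_2 e_3 e_1 e_2)
               = \tfrac{1}{2}(e_1 e_3 + e_1 e_3) = e_1 e_3 = -B_2, \\
B_1 \times B_2 &= -B_3, \qquad B_2 \times B_3 = -B_1 .
\end{align*}
```

Each commutator of two basis bivectors is again a basis bivector, up to sign, in line with equation (4.63) for r = 2.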

A similar argument to the preceding one shows that the symmetric product 
with a bivector must raise or lower the grade by 2. We can summarise this by 
writing 


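\begin{align*}
B A_r &= \langle B A_r \rangle_{r-2} + \langle B A_r \rangle_r + \langle B A_r \rangle_{r+2} \\
&= B \cdot A_r + B \times A_r + B \wedge A_r, \tag{4.64}
\end{align*}
where 
\[ \tfrac{1}{2}(B A_r - A_r B) = B \times A_r \tag{4.65} \]
and 
\[ \tfrac{1}{2}(B A_r + A_r B) = B \cdot A_r + B \wedge A_r. \tag{4.66} \]

It is assumed in these formulae that $A_r$ has grade r > 1. 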


4.1.4 Pseudoscalars and duality 


The exterior product of n vectors defines a grade-n blade. For a given vector 
space the highest grade element is unique, up to a magnitude. The outer product 
of n vectors is therefore a multiple of the unique pseudoscalar for $\mathcal{G}_n$. This is 
denoted $I$, and has two important properties. The first is that $I$ is normalised 
to 

\[ |I^2| = 1. \tag{4.67} \]

The sign of $I^2$ depends on the size of space and the signature. It turns out that 
the pseudoscalar squares to −1 for the three algebras of most use in this book: 
those of the Euclidean plane and space, and of spacetime. But this is in no 
way a general property. 

The second property of the pseudoscalar I is that it defines an orientation. 
For any ordered set of n vectors, their outer product will either have the same 
sign as I, or the opposite sign. Those with the same sign are assigned a positive 
orientation, and those with opposite sign have a negative orientation. The ori- 
entation is swapped by interchanging any pair of vectors. In three dimensions 
we always choose the pseudoscalar $I$ such that it has the orientation specified by 
a right-handed set of vectors. In other spaces one just asserts a choice of $I$ and 
then sticks to that choice consistently. 

The product of the grade-n pseudoscalar $I$ with a grade-r multivector $A_r$ is 
a grade n − r multivector. This operation is called a duality transformation. If 
$A_r$ is a blade, $IA_r$ returns the orthogonal complement of $A_r$. That is, the blade 
formed from the space of vectors not contained in $A_r$. It is clear why this has 
grade n − r. Every blade acts as a pseudoscalar for the space spanned by its 
generating vectors. So, even if we are working in three dimensions, we can treat 
the bivector $e_1 e_2$ as a pseudoscalar for any manipulation taking place entirely in 
the $e_1 e_2$ plane. This is often a very helpful idea. 

In spaces of odd dimension, $I$ commutes with all vectors and so commutes with 
all multivectors. In spaces of even dimension, $I$ anticommutes with vectors and 
so anticommutes with all odd-grade multivectors. In all cases the pseudoscalar 
commutes with all even-grade multivectors in its algebra. We summarise this by 

\[ I A_r = (-1)^{r(n-1)} A_r I. \tag{4.68} \]

An important use of the pseudoscalar is for interchanging inner and outer 
products. For example, we have 

\begin{align*}
a \cdot (A_r I) &= \tfrac{1}{2}\bigl(a A_r I - (-1)^{n-r} A_r I a\bigr) \\
&= \tfrac{1}{2}\bigl(a A_r I - (-1)^{n-r} (-1)^{n-1} A_r a I\bigr) \\
&= \tfrac{1}{2}\bigl(a A_r + (-1)^r A_r a\bigr) I \\
&= a \wedge A_r\, I. \tag{4.69}
\end{align*}

More generally, we can take two multivectors $A_r$ and $B_s$, with $r + s \le n$, and 
form 

\begin{align*}
A_r \cdot (B_s I) &= \langle A_r B_s I \rangle_{|r-(n-s)|} \\
&= \langle A_r B_s I \rangle_{n-(r+s)} \\
&= \langle A_r B_s \rangle_{r+s}\, I \\
&= A_r \wedge B_s\, I. \tag{4.70}
\end{align*}


This type of interchange is very common in applications. Note how simple this 
proof is made by the application of the geometric product in the intermediate 
steps. 
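
A minimal instance of the interchange in equation (4.69), assuming three Euclidean dimensions with $I = e_1 e_2 e_3$ and illustrative choices $a = e_1$, $A_1 = e_2$ made here (so n = 3, r = 1):

```latex
\begin{align*}
A_1 I &= e_2\, e_1 e_2 e_3 = -e_1 e_3 = e_3 e_1, \\
a \cdot (A_1 I) &= e_1 \cdot (e_3 \wedge e_1) = (e_1 \cdot e_3)\, e_1 - (e_1 \cdot e_1)\, e_3 = -e_3, \\
a \wedge A_1\, I &= e_1 e_2\, e_1 e_2 e_3 = (e_1 e_2)^2 e_3 = -e_3 .
\end{align*}
```

The inner product with the dual bivector and the dual of the outer product agree, as the general result requires.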


4.2 Rotations and reflections 


In chapter 2 we showed that in three dimensions a reflection in the plane per- 
pendicular to the unit vector n is performed by 


\[ a \mapsto a' = -nan. \tag{4.71} \]

This formula holds in arbitrary numbers of dimensions. Provided $n^2 = 1$, we see 
that n is transformed to 

\[ n \mapsto -nnn = -n, \tag{4.72} \]
whereas any vector $a_\perp$ perpendicular to n is mapped to 

\[ a_\perp \mapsto -n a_\perp n = a_\perp nn = a_\perp. \tag{4.73} \]
So, for a vector a, the component parallel to n has its sign reversed, whereas 
the component perpendicular to n is unchanged. This is what we mean by a 
reflection in the hyperplane perpendicular to n. 

Two successive reflections in the hyperplanes perpendicular to m and n result 
in a rotation in the $m \wedge n$ plane. This is encoded in the rotor 

\[ R = nm = \exp(-\hat{B}\theta/2) \tag{4.74} \]
where 
\[ \cos(\theta/2) = n \cdot m, \qquad \hat{B} = \frac{m \wedge n}{\sin(\theta/2)}. \tag{4.75} \]

The rotor R generates a rotation through the by now familiar formula 
\[ a \mapsto a' = R a R^\dagger. \tag{4.76} \]

Rotations form a group, as the result of combining two rotations is a third 
rotation. The same must therefore be true of rotors. Suppose that $R_1$ and $R_2$ 
generate two distinct rotations. The combined rotations take a to 

\[ a \mapsto R_2 (R_1 a R_1^\dagger) R_2^\dagger = R_2 R_1 a R_1^\dagger R_2^\dagger. \tag{4.77} \]
We therefore define the product rotor 
\[ R = R_2 R_1, \tag{4.78} \]

so that the result of the composite rotation is described by $R a R^\dagger$, as usual. The 
product R is a new rotor, and in general it will consist of geometric products of 
an even number of unit vectors, 

\[ R = lk \cdots nm. \tag{4.79} \]
We will adopt this as our definition of a rotor. The reversed rotor is 
\[ R^\dagger = mn \cdots kl. \tag{4.80} \]
The result of the map $a \mapsto R a R^\dagger$ returns a vector for any vector a, since 
\[ R a R^\dagger = lk \cdots (n(mam)n) \cdots kl \tag{4.81} \]

and each successive sandwich between a vector returns a new vector. 
We can immediately establish the normalisation condition 

\[ R R^\dagger = lk \cdots nmmn \cdots kl = 1 = R^\dagger R. \tag{4.82} \]

In Euclidean spaces, where every vector has a positive square, this normalisation 
is automatic. In mixed signature spaces, like Minkowski spacetime, unit vectors 
can have $n^2 = \pm 1$. In this case the condition $R R^\dagger = 1$ is taken as a further 
condition satisfied by a rotor. In the case where R is the product of two rotors 
we can easily confirm that 

\[ R R^\dagger = R_2 R_1 (R_2 R_1)^\dagger = R_2 R_1 R_1^\dagger R_2^\dagger = 1. \tag{4.83} \]
The set of rotors therefore forms a group, called a rotor group. This is similar to 
the group of rotation matrices, though not identical due to the two-to-one map 
between rotors and rotation matrices. We will have more to say about the group 
properties of rotors in chapter 11. 
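
The link between double reflections and rotors can be tried out numerically. The sketch below is an illustration under stated assumptions, not a prescribed implementation: it works in three Euclidean dimensions, stores multivectors as dictionaries keyed by basis-blade bitmaps, and every helper name (`reorder_sign`, `gp`, `reverse`, `sandwich`, `reflect`, `coords`) is hypothetical. The unit vectors m and n are chosen 45° apart, so the rotor R = nm should rotate by 90° in the m ∧ n plane.

```python
import math


def reorder_sign(a, b):
    """Sign from multiplying basis blades a and b (bitmaps; bit i <-> e_{i+1}).
    Assumes a Euclidean frame, so every basis vector squares to +1."""
    swaps = 0
    a >>= 1
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def gp(A, B):
    """Geometric product of multivectors stored as {blade_bitmap: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + reorder_sign(a, b) * x * y
    return out


def reverse(A):
    """Reversion: a grade-r term picks up the sign (-1)^(r(r-1)/2)."""
    out = {}
    for k, v in A.items():
        r = bin(k).count("1")
        out[k] = -v if (r * (r - 1) // 2) % 2 else v
    return out


def sandwich(R, a):
    """The map a -> R a R^dagger of equation (4.76)."""
    return gp(gp(R, a), reverse(R))


def reflect(n, a):
    """Reflection a -> -n a n in the hyperplane perpendicular to unit vector n."""
    return {k: -v for k, v in gp(gp(n, a), n).items()}


def coords(A, dim=3):
    """Components of the grade-1 part along e1, ..., e_dim."""
    return [A.get(1 << i, 0.0) for i in range(dim)]


s = 1 / math.sqrt(2)
m = {1: 1.0}               # m = e1
n = {1: s, 2: s}           # n = (e1 + e2)/sqrt(2), at 45 degrees to m
R = gp(n, m)               # rotor for reflection in m followed by reflection in n

a = {1: 1.0, 4: 1.0}       # a = e1 + e3
rotated = sandwich(R, a)
twice_reflected = reflect(n, reflect(m, a))

print(coords(rotated))
print(coords(twice_reflected))
```

Up to rounding, both routes send e1 + e3 to e2 + e3: the component in the m ∧ n plane turns through 90° from m towards n, while the e3 component is untouched. Note that −R generates the same rotation, which is the two-to-one map mentioned above.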

In Euclidean spaces every rotor can be written as the exponential of a bivector, 


\[ R = \exp(-B/2). \tag{4.84} \]

The bivector B defines the plane or planes in which the rotation takes place. 
The sign ensures that the rotation has the orientation defined by B. In mixed 
signature spaces one can always write a rotor as $\pm\exp(B)$. In either case the 
effect of the rotor R on the vector a is 

\[ a \mapsto \exp(-B/2)\, a \exp(B/2). \tag{4.85} \]

We can prove that the right-hand side always returns a vector by considering a 
Taylor expansion of 

\[ a(\lambda) = \exp(-\lambda B/2)\, a \exp(\lambda B/2). \tag{4.86} \]

Differentiating the expression on the right produces the power series expansion 

\[ a(\lambda) = a + \lambda\, a \cdot B + \frac{\lambda^2}{2!}\, (a \cdot B) \cdot B + \cdots. \tag{4.87} \]

Since the inner product of a vector and a bivector always results in a new vector, 
each term in this expansion is a vector. Setting $\lambda = 1$ then demonstrates that 
equation (4.85) results in a new vector, defined by 

\[ \exp(-B/2)\, a \exp(B/2) = a + a \cdot B + \frac{1}{2!}\, (a \cdot B) \cdot B + \cdots. \tag{4.88} \]


4.2.1 Multivector transformations 


Suppose now that every vector in a blade undergoes the same rotation. This is
the sort of transformation implied if a plane or volume element is to be rotated.
The $r$-blade $A_r$ can be written

$$A_r = a_1 \wedge a_2 \wedge \cdots \wedge a_r = \frac{1}{r!} \sum (-1)^{\epsilon} a_{k_1} a_{k_2} \cdots a_{k_r}, \tag{4.89}$$

with the sum running over all permutations. If each vector in a geometric product
is rotated, the result is the multivector

$$\begin{aligned} (R a_1 R^\dagger)(R a_2 R^\dagger) \cdots (R a_r R^\dagger) &= R a_1 R^\dagger R a_2 R^\dagger \cdots R a_r R^\dagger \\ &= R a_1 a_2 \cdots a_r R^\dagger. \end{aligned} \tag{4.90}$$

This holds for each term in the antisymmetrised sum, so the transformation law
for the blade $A_r$ is simply

$$A_r \mapsto A_r' = R A_r R^\dagger. \tag{4.91}$$
Blades transform with the same simple law as vectors! All multivectors share
the same transformation law regardless of grade when each component vector 
is rotated. This is one reason why the rotor formulation is so powerful. The 
alternative, tensor form would require an extra matrix for each additional vector. 


4.3 Bases, frames and components 


Any set of linearly independent vectors forms a basis for the vectors in a geometric
algebra. Such a set is often referred to as a frame. Repeated use of the outer
product then builds up a basis for the entire algebra. In this section we use the
symbols $\mathsf{e}_1, \ldots, \mathsf{e}_n$ or $\{\mathsf{e}_k\}$ to denote a frame for $n$-dimensional space. We do not
restrict the frame to be orthonormal, so the $\{\mathsf{e}_k\}$ do not necessarily anticommute.
The reason for the change of font for frame vectors, as opposed to general sets of
vectors, is that use of frames nearly always implies reference to coordinates. It
is natural to write the coordinates of the vector $a$ as $a_i$ or $a^i$ so, to avoid confusion
with a set of vectors, we write the frame vectors in a different font.
The volume element for the $\{\mathsf{e}_k\}$ frame is defined by

$$E_n = \mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \cdots \wedge \mathsf{e}_n. \tag{4.92}$$


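The grade-$n$ multivector $E_n$ is a multiple of the pseudoscalar for the space
spanned by the $\{\mathsf{e}_k\}$. The fact that the vectors are independent guarantees
that $E_n \neq 0$. Associated with any arbitrary frame is a reciprocal frame $\{\mathsf{e}^k\}$
defined by the property

$$\mathsf{e}^i \cdot \mathsf{e}_j = \delta^i_j, \qquad \forall \, i, j = 1 \ldots n. \tag{4.93}$$

The 'Kronecker $\delta$', $\delta^i_j$, has value $+1$ if $i = j$ and is zero otherwise. The reciprocal
frame is constructed as follows:

$$\mathsf{e}^j = (-1)^{j-1} \, \mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \cdots \wedge \check{\mathsf{e}}_j \wedge \cdots \wedge \mathsf{e}_n \, E_n^{-1}, \tag{4.94}$$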


where as usual the check on $\check{\mathsf{e}}_j$ denotes that this term is missing from the
expression. The formula for $\mathsf{e}^j$ has a simple interpretation. The vector $\mathsf{e}^j$ must
be perpendicular to all the vectors $\{\mathsf{e}_i, i \neq j\}$. To find this we form the exterior
product of the $n - 1$ vectors $\{\mathsf{e}_i, i \neq j\}$. The dual of this returns a vector
perpendicular to all vectors in the subspace, and this duality is achieved by the
factor of $E_n$. All that remains is to fix up the normalisation. For this we recall
the duality results of section 4.1.4 and form

$$\mathsf{e}_1 \cdot \mathsf{e}^1 = \mathsf{e}_1 \cdot (\mathsf{e}_2 \wedge \cdots \wedge \mathsf{e}_n \, E_n^{-1}) = (\mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \cdots \wedge \mathsf{e}_n) E_n^{-1} = 1. \tag{4.95}$$

This confirms that the formula for the reciprocal frame is correct.
Figure 4.2 The reciprocal frame. The vectors $\mathsf{e}_1$, $\mathsf{e}_2$ and $\mathsf{e}_3$ form a
non-orthonormal frame for three-dimensional space. The vector $\mathsf{e}^3$ is formed
by constructing the $\mathsf{e}_1 \wedge \mathsf{e}_2$ plane, and forming the vector perpendicular to
this plane. The length is fixed by demanding $\mathsf{e}^3 \cdot \mathsf{e}_3 = 1$.


4.3.1 Application — crystallography 


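An important application of the formula for a reciprocal frame is in crystallography.
If a crystal contains some repeated structure defined by the vectors
$\mathsf{e}_1, \mathsf{e}_2, \mathsf{e}_3$, then constructive interference occurs for wavevectors whose difference
satisfies

$$\Delta k = 2\pi (n_1 \mathsf{e}^1 + n_2 \mathsf{e}^2 + n_3 \mathsf{e}^3), \tag{4.96}$$

where $n_1, n_2, n_3$ are integers. The reciprocal frame is defined by

$$\mathsf{e}^1 = \frac{\mathsf{e}_2 \wedge \mathsf{e}_3}{\mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \mathsf{e}_3}, \qquad \mathsf{e}^2 = \frac{\mathsf{e}_3 \wedge \mathsf{e}_1}{\mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \mathsf{e}_3}, \qquad \mathsf{e}^3 = \frac{\mathsf{e}_1 \wedge \mathsf{e}_2}{\mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \mathsf{e}_3}. \tag{4.97}$$

If we write

$$\mathsf{e}_1 \wedge \mathsf{e}_2 \wedge \mathsf{e}_3 = [\mathsf{e}_1, \mathsf{e}_2, \mathsf{e}_3] I, \tag{4.98}$$

where $I$ is the three-dimensional pseudoscalar and $[\mathsf{e}_1, \mathsf{e}_2, \mathsf{e}_3]$ denotes the scalar
triple product, we arrive at the standard formula

$$\mathsf{e}^1 = \frac{(\mathsf{e}_2 \wedge \mathsf{e}_3) I^{-1}}{[\mathsf{e}_1, \mathsf{e}_2, \mathsf{e}_3]} = \frac{\mathsf{e}_2 \times \mathsf{e}_3}{[\mathsf{e}_1, \mathsf{e}_2, \mathsf{e}_3]}, \tag{4.99}$$

with similar results holding for $\mathsf{e}^2$ and $\mathsf{e}^3$. Here the bold cross $\times$ denotes the
vector cross product, not to be confused with the commutator product. Figure 4.2
illustrates the geometry involved in defining the reciprocal frame.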
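The standard formula (4.99) is easy to try out on a concrete lattice. The sketch below is illustrative only: the frame vectors are arbitrary example values, and the helper names `cross`, `dot` and `reciprocal_frame` are assumptions made for this example rather than anything defined in the text.

```python
def cross(u, v):
    """Vector cross product in three dimensions."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def reciprocal_frame(e1, e2, e3):
    """Reciprocal vectors from (4.99): e^1 = e2 x e3 / [e1, e2, e3], and cyclic."""
    volume = dot(e1, cross(e2, e3))        # scalar triple product [e1, e2, e3]
    if volume == 0:
        raise ValueError("frame vectors are linearly dependent")
    pairs = ((e2, e3), (e3, e1), (e1, e2))
    return [tuple(c / volume for c in cross(u, v)) for u, v in pairs]

# A non-orthonormal example frame (values chosen only for illustration).
frame = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
recip = reciprocal_frame(*frame)

# The defining property (4.93): e^i . e_j is 1 when i = j and 0 otherwise.
for r in recip:
    print([dot(r, e) for e in frame])
```

Each printed row is one row of the identity matrix, which is the condition (4.93) written out in components.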


4.3.2 Components 


The basis vectors $\{\mathsf{e}_k\}$ are linearly independent, so any vector $a$ can be written
uniquely in terms of this set as

$$a = a^i \mathsf{e}_i = a_i \mathsf{e}^i. \tag{4.100}$$

We continue to employ the summation convention and summed indices appear
once as a superscript and once as a subscript. The set of scalars $(a^1, \ldots, a^n)$ are
the components of the vector $a$ in the $\{\mathsf{e}_k\}$ frame. To find the components we
form

$$a \cdot \mathsf{e}^i = a^j \mathsf{e}_j \cdot \mathsf{e}^i = a^j \delta^i_j = a^i \tag{4.101}$$

and

$$a \cdot \mathsf{e}_i = a_j \mathsf{e}^j \cdot \mathsf{e}_i = a_j \delta^j_i = a_i. \tag{4.102}$$


These formulae explain the labelling scheme for the components. In many ap- 
plications we are only interested in orthonormal frames in Euclidean space. In 
this case the frame and its reciprocal are equivalent, and there is no need for 
the distinct subscript and superscript indices. The notation is unavoidable in 
mixed signature spaces, however, and is very useful in differential geometry, so 
it is best to adopt it at the outset. 
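The two sets of components can be computed directly for a small non-orthonormal frame. This is a sketch under stated assumptions: it works in two dimensions, where the construction (4.94) reduces to a closed form, and the function name `reciprocal_2d` and the sample vectors are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def reciprocal_2d(e1, e2):
    """Reciprocal frame of a 2D frame, i.e. (4.94) specialised to n = 2."""
    area = e1[0] * e2[1] - e1[1] * e2[0]   # coefficient of e1 ^ e2
    if area == 0:
        raise ValueError("frame vectors are linearly dependent")
    r1 = (e2[1] / area, -e2[0] / area)     # perpendicular to e2, with r1 . e1 = 1
    r2 = (-e1[1] / area, e1[0] / area)     # perpendicular to e1, with r2 . e2 = 1
    return r1, r2

e1, e2 = (2.0, 0.0), (1.0, 1.0)            # example frame, not orthonormal
r1, r2 = reciprocal_2d(e1, e2)
a = (3.0, 4.0)

upper = [dot(a, r1), dot(a, r2)]           # a^i = a . e^i, equation (4.101)
lower = [dot(a, e1), dot(a, e2)]           # a_i = a . e_i, equation (4.102)

# Rebuild the vector both ways, as in (4.100).
from_upper = tuple(upper[0] * x + upper[1] * y for x, y in zip(e1, e2))
from_lower = tuple(lower[0] * x + lower[1] * y for x, y in zip(r1, r2))
print(upper, lower)
print(from_upper, from_lower)              # both reproduce a
```

The upper and lower components differ here because the frame is not orthonormal; for an orthonormal frame the two lists would coincide.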

Combining the equations (4.100), (4.101) and (4.102) we see that

$$a \cdot \mathsf{e}_i \, \mathsf{e}^i = a_i \mathsf{e}^i = a. \tag{4.103}$$


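This holds for any vector $a$ in the space spanned by the $\{\mathsf{e}_k\}$. This result
generalises simply to arbitrary multivectors. First, for the bivector $a \wedge b$ we have

$$\mathsf{e}_i \, \mathsf{e}^i \cdot (a \wedge b) = \mathsf{e}_i \, \mathsf{e}^i \cdot a \, b - \mathsf{e}_i \, \mathsf{e}^i \cdot b \, a = ab - ba = 2 a \wedge b. \tag{4.104}$$

This extends for an arbitrary grade-$r$ multivector $A_r$ to give

$$\mathsf{e}_i \, \mathsf{e}^i \cdot A_r = r A_r. \tag{4.105}$$

Since $\mathsf{e}_i \mathsf{e}^i = n$, we also see that

$$\mathsf{e}_i \, \mathsf{e}^i \wedge A_r = \mathsf{e}_i (\mathsf{e}^i A_r - \mathsf{e}^i \cdot A_r) = (n - r) A_r. \tag{4.106}$$

Subtracting the two preceding results we obtain

$$\mathsf{e}_i A_r \mathsf{e}^i = (-1)^r (n - 2r) A_r. \tag{4.107}$$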


The $\{\mathsf{e}_k\}$ basis extends easily to provide a basis for the entire algebra generated
by the basis vectors. We can then decompose any multivector $A$ into a set of
components through

$$A_{ij \cdots k} = \langle (\mathsf{e}_k \wedge \cdots \wedge \mathsf{e}_j \wedge \mathsf{e}_i) A \rangle \tag{4.108}$$

and

$$A = \sum_{i < j < \cdots < k} A_{ij \cdots k} \, \mathsf{e}^i \wedge \mathsf{e}^j \wedge \cdots \wedge \mathsf{e}^k. \tag{4.109}$$

The components $A_{ij \cdots k}$ are totally antisymmetric on all indices and are usually
referred to as the components of an antisymmetric tensor. We shall have more
to say about tensors in following sections.
4.3.3 Application — recovering a rotor


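As an application of the preceding results, suppose that we have two sets of
vectors in three dimensions $\{\mathsf{e}_k\}$ and $\{\mathsf{f}_k\}$, $k = 1, 2, 3$. The vectors need not
be orthonormal, but we know that the two sets are related by a rotation. The
rotation is governed by the formula

$$\mathsf{f}_k = R \mathsf{e}_k R^\dagger \tag{4.110}$$

and we seek a simple expression for the rotor $R$. In three dimensions the rotor
$R$ can be written as

$$R = \exp(-B/2) = \alpha - \beta B, \tag{4.111}$$

where

$$\alpha = \cos(|B|/2), \qquad \beta = \frac{\sin(|B|/2)}{|B|}. \tag{4.112}$$

The reverse is

$$R^\dagger = \exp(B/2) = \alpha + \beta B. \tag{4.113}$$

We therefore find that

$$\begin{aligned} \mathsf{e}_k R^\dagger \mathsf{e}^k &= \mathsf{e}_k (\alpha + \beta B) \mathsf{e}^k \\ &= 3\alpha - \beta B \\ &= 4\alpha - R^\dagger. \end{aligned} \tag{4.114}$$

We now form

$$\mathsf{f}_k \mathsf{e}^k = R \mathsf{e}_k R^\dagger \mathsf{e}^k = 4\alpha R - 1. \tag{4.115}$$

It follows that $R$ is a scalar multiple of $1 + \mathsf{f}_k \mathsf{e}^k$. We therefore establish the
simple formula

$$R = \frac{1 + \mathsf{f}_k \mathsf{e}^k}{|1 + \mathsf{f}_k \mathsf{e}^k|} = \frac{\psi}{\sqrt{\psi \psi^\dagger}}, \tag{4.116}$$

where $\psi = 1 + \mathsf{f}_k \mathsf{e}^k$. This compact formula recovers the rotor directly from
the frame vectors. A problem arises if the rotation is through precisely 180°, in
which case $\psi$ vanishes. This case can be dealt with simply enough by considering
the image of two of the three vectors.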
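The recovery formula can be tested by rotating a frame with a known rotor and then reconstructing it. The sketch below makes two simplifying assumptions that go beyond the text: the frame is orthonormal, so each reciprocal vector equals the corresponding frame vector, and multivectors are stored as dictionaries from basis-blade bitmasks to coefficients. All function names are hypothetical.

```python
import math

def reorder_sign(a, b):
    """Sign picked up when the basis blades a and b (bitmasks) are multiplied."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps % 2 else 1.0

def gp(x, y):
    """Euclidean geometric product of multivectors stored as {bitmask: coeff}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            out[a ^ b] = out.get(a ^ b, 0.0) + reorder_sign(a, b) * ca * cb
    return out

def reverse(x):
    """Reverse: a grade-r blade changes sign when r(r-1)/2 is odd."""
    return {a: (-c if (bin(a).count("1") * (bin(a).count("1") - 1) // 2) % 2 else c)
            for a, c in x.items()}

def rotate(R, a):
    return gp(gp(R, a), reverse(R))

def recover_rotor(frame, images):
    """R = psi / sqrt(psi psi^dagger) with psi = 1 + f_k e^k, as in (4.116).

    Assumes `frame` is orthonormal, so the reciprocal vectors e^k equal e_k.
    """
    psi = {0: 1.0}
    for e, f in zip(frame, images):
        for blade, c in gp(f, e).items():
            psi[blade] = psi.get(blade, 0.0) + c
    norm2 = gp(psi, reverse(psi)).get(0, 0.0)
    if norm2 < 1e-20:
        raise ValueError("psi vanishes: rotation through 180 degrees")
    scale = math.sqrt(norm2)
    return {blade: c / scale for blade, c in psi.items()}

frame = [{0b001: 1.0}, {0b010: 1.0}, {0b100: 1.0}]
# A known rotor built from two plane rotations (angles chosen arbitrarily).
R0 = gp({0: math.cos(0.35), 0b110: -math.sin(0.35)},
        {0: math.cos(0.5), 0b011: -math.sin(0.5)})
images = [rotate(R0, e) for e in frame]
R = recover_rotor(frame, images)
print(R0)
print(R)   # agrees with R0 up to rounding
```

For a non-orthonormal frame the only change would be to multiply each image by the true reciprocal vector instead of the frame vector itself.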


4.4 Linear algebra 


Many key relations in physics involve linear mappings between two, sometimes 
different, spaces. These are the subject of tensor analysis in the standard litera- 
ture. Examples include the stress and strain tensors of elasticity, the conductivity 
tensor of electromagnetism and the inertia tensor of dynamics. If one has only 
met the study of linear transformations through tensor analysis, one could be 
forgiven for thinking that the subject cannot be discussed without a large dose
of index notation. The indices refer to components of tensors in some frame, 
though the essence of tensor analysis is to establish a set of results which are 
independent of the choice of frame. In our opinion, this subject is much more 
simply dealt with if one can avoid specifying a frame until it is absolutely neces- 
sary. Perhaps unsurprisingly, it is geometric algebra that provides precisely the 
tools necessary to achieve such a development. 

In this section we use capital, sans-serif symbols for linear functions. This helps 
to distinguish functions from their multivector argument. The dimension and 
signature of the vector space is arbitrary unless otherwise specified. We assume 
that readers are familiar with the basic properties of linear transformations in 
the guise of matrices. Suppose, then, that we are interested in a quantity $\mathsf{F}$
which maps vectors to vectors linearly in the same space. That is, if $a$ is a vector
in the space acted on by $\mathsf{F}$, then $\mathsf{F}(a)$ lies in the same space. The linearity of $\mathsf{F}$
is expressed by

$$\mathsf{F}(\lambda a + \mu b) = \lambda \mathsf{F}(a) + \mu \mathsf{F}(b), \tag{4.117}$$

for scalars $\lambda$ and $\mu$ and vectors $a$ and $b$. Geometrically, we can think of $\mathsf{F}$ as an
instruction to take a vector and rotate/dilate it to a new vector. No frame or
components are required for such a picture. A simple example is provided by a
rotation, which can be written as

$$\mathsf{R}(a) = R a R^\dagger, \tag{4.118}$$

where $R$ is a rotor. It is a simple matter to confirm that this map is linear.


4.4.1 Extension to multivectors 


Once one has formulated the action of a linear function on a vector, the obvious
next step is to let the function act on a multivector. In this way we extend the
action of a linear function to the full geometric algebra defined by the underlying
vector space. Suppose that two vectors $a$ and $b$ are acted on by the linear
function $\mathsf{F}$. The bivector $a \wedge b$ then transforms to $\mathsf{F}(a) \wedge \mathsf{F}(b)$. We take this as
the definition for the action of $\mathsf{F}$ on a bivector blade:

$$\mathsf{F}(a \wedge b) = \mathsf{F}(a) \wedge \mathsf{F}(b). \tag{4.119}$$

Since the right-hand side is the outer product of two vectors, it is also a bivector
blade (see figure 4.3). The action on sums of blades is defined by the linearity
of $\mathsf{F}$:

$$\mathsf{F}(a \wedge b + c \wedge d) = \mathsf{F}(a \wedge b) + \mathsf{F}(c \wedge d). \tag{4.120}$$

Continuing in this manner, we define the action of $\mathsf{F}$ on an arbitrary blade by

$$\mathsf{F}(a \wedge b \wedge \cdots \wedge c) = \mathsf{F}(a) \wedge \mathsf{F}(b) \wedge \cdots \wedge \mathsf{F}(c). \tag{4.121}$$
Figure 4.3 The extended linear function. The action of $\mathsf{F}$ on the bivector
$a \wedge b$ results in the new plane $\mathsf{F}(a) \wedge \mathsf{F}(b)$. This is the definition of $\mathsf{F}(a \wedge b)$.


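Extension by linearity then defines the action of $\mathsf{F}$ on arbitrary multivectors. By
construction, $\mathsf{F}$ is both linear over multivectors,

$$\mathsf{F}(\lambda A + \mu B) = \lambda \mathsf{F}(A) + \mu \mathsf{F}(B), \tag{4.122}$$

and grade-preserving,

$$\mathsf{F}(A_r) = \langle \mathsf{F}(A_r) \rangle_r, \tag{4.123}$$

where $A_r$ is a grade-$r$ multivector. A simple example is provided by rotations.
We have already established a formula for the result of rotating all of the vectors
in a blade. For the extension of a rotation we therefore have

$$\begin{aligned} \mathsf{R}(a \wedge b \wedge \cdots \wedge c) &= (R a R^\dagger) \wedge (R b R^\dagger) \wedge \cdots \wedge (R c R^\dagger) \\ &= R \, a \wedge b \wedge \cdots \wedge c \, R^\dagger. \end{aligned} \tag{4.124}$$

It follows that acting on an arbitrary multivector $A$ we have

$$\mathsf{R}(A) = R A R^\dagger. \tag{4.125}$$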


Again, it is simple to confirm that this has the expected properties. 


4.4.2 The product 


The product of two linear functions is formed by letting a second function act 
on the result of the first function. Thus the action of the product of F and G is 
defined by 


(FG)(a) = F(G(a)) = FG(a). (4.126) 


The final expression enables us to remove some brackets without any ambiguity. 
A price to pay for removing indices is that brackets are often required to show 
how calculations are ordered. Any convention that enables brackets to be
systematically dropped is then well worth adopting. It is straightforward to show
that $\mathsf{FG}$ is a linear function if $\mathsf{F}$ and $\mathsf{G}$ are both linear:

$$\mathsf{FG}(\lambda a + \mu b) = \mathsf{F}(\lambda \mathsf{G}(a) + \mu \mathsf{G}(b)) = \lambda \mathsf{FG}(a) + \mu \mathsf{FG}(b). \tag{4.127}$$


Next we form the extension of a product function. Suppose that H is given by 
the product of F and G: 


H(a) = F(G(a)) = FG(a). (4.128) 
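It follows that

$$\begin{aligned} \mathsf{H}(a \wedge b \wedge \cdots \wedge c) &= \mathsf{F}(\mathsf{G}(a)) \wedge \mathsf{F}(\mathsf{G}(b)) \wedge \cdots \wedge \mathsf{F}(\mathsf{G}(c)) \\ &= \mathsf{F}(\mathsf{G}(a) \wedge \mathsf{G}(b) \wedge \cdots \wedge \mathsf{G}(c)) \\ &= \mathsf{F}(\mathsf{G}(a \wedge b \wedge \cdots \wedge c)), \end{aligned} \tag{4.129}$$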


so the multilinear action of the product of two linear functions is the product of 
their exterior actions. In dealing with combinations of linear functions we can 
therefore write 


H(A) = FG(A), (4.130) 


since the meaning of the right-hand side is unambiguous. 


4.4.3 The adjoint 
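Given a linear function $\mathsf{F}$, the adjoint, or transpose, $\bar{\mathsf{F}}$ is defined so that

$$a \cdot \bar{\mathsf{F}}(b) = \mathsf{F}(a) \cdot b, \tag{4.131}$$

for all vectors $a$ and $b$. If $\mathsf{F}$ is a mapping from one vector space to another, then
the adjoint function maps from the second space back to the first. In terms of
an arbitrary frame $\{\mathsf{e}_k\}$ we have

$$\mathsf{e}_i \cdot \bar{\mathsf{F}}(a) = a \cdot \mathsf{F}(\mathsf{e}_i), \tag{4.132}$$

so we can construct the adjoint using

$$\mathrm{ad}(\mathsf{F})(a) = \bar{\mathsf{F}}(a) = \mathsf{e}^i \, a \cdot \mathsf{F}(\mathsf{e}_i). \tag{4.133}$$

The notation of a bar for the adjoint, rather than a superscript T or †, is slightly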
unconventional, though it does agree with that of Hestenes & Sobczyk (1984). 
The notation is very useful in handwritten work, where it is also convenient to 
denote the linear function with an underline. Some formulae relating functions 
and their adjoints have a neat symmetry when this overbar/underbar convention 
is followed. 
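The operation of taking the adjoint of the adjoint of a function returns the
original function. This is verified by forming

$$\mathrm{ad}(\bar{\mathsf{F}})(a) = \mathsf{e}^i \, a \cdot \bar{\mathsf{F}}(\mathsf{e}_i) = \mathsf{e}^i \, \mathsf{e}_i \cdot \mathsf{F}(a) = \mathsf{F}(a). \tag{4.134}$$

The adjoint of a product of two functions is found as follows:

$$\begin{aligned} \mathrm{ad}(\mathsf{FG})(a) &= \mathsf{e}^i \, a \cdot \mathsf{FG}(\mathsf{e}_i) = \bar{\mathsf{F}}(a) \cdot \mathsf{G}(\mathsf{e}_i) \, \mathsf{e}^i \\ &= \bar{\mathsf{G}} \bar{\mathsf{F}}(a) \cdot \mathsf{e}_i \, \mathsf{e}^i = \bar{\mathsf{G}} \bar{\mathsf{F}}(a). \end{aligned} \tag{4.135}$$

The operation of taking the adjoint of a product therefore reverses the order
in which the linear functions act. A symmetric function is one which is equal
to its own adjoint, $\bar{\mathsf{F}} = \mathsf{F}$. Two particularly significant examples of symmetric
functions are the functions $\bar{\mathsf{F}}\mathsf{F}$ and $\mathsf{F}\bar{\mathsf{F}}$. To verify that these are symmetric we
form

$$\mathrm{ad}(\bar{\mathsf{F}}\mathsf{F}) = \mathrm{ad}(\mathsf{F}) \, \mathrm{ad}(\bar{\mathsf{F}}) = \bar{\mathsf{F}}\mathsf{F}, \tag{4.136}$$

with a similar derivation holding for $\mathsf{F}\bar{\mathsf{F}}$. These functions will be met again later
in this chapter.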
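The construction (4.133) translates almost literally into code. The following sketch assumes an orthonormal Euclidean frame, so that reciprocal and frame vectors coincide, and uses two made-up linear maps `F` and `G` purely as test inputs; none of these names come from the text.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def adjoint(F, n):
    """Adjoint from (4.133), Fbar(a) = e^i a.F(e_i), in an orthonormal frame."""
    frame = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    images = [F(e) for e in frame]          # F(e_1), ..., F(e_n)
    return lambda a: [dot(a, img) for img in images]

# Two hypothetical linear maps on 3D vectors, chosen only for illustration.
def F(a):
    x, y, z = a
    return [x + 2 * y, 3 * y - z, 4 * x + z]

def G(a):
    x, y, z = a
    return [y, x + z, 2 * z]

Fbar, Gbar = adjoint(F, 3), adjoint(G, 3)
FG = lambda a: F(G(a))
FGbar = adjoint(FG, 3)

a, b = [1.0, 2.0, 3.0], [-1.0, 0.0, 2.0]
print(dot(a, Fbar(b)), dot(F(a), b))     # equal, by the defining property (4.131)
print(FGbar(a), Gbar(Fbar(a)))           # equal: the adjoint reverses the order
```

No matrix is ever written down: the adjoint is built only from the action of `F` on the frame vectors, which mirrors the frame-free spirit of the derivation.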


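The adjoint is still a linear function, so its extension to arbitrary multivectors
is precisely as expected:

$$\bar{\mathsf{F}}(a \wedge b \wedge \cdots \wedge c) = \bar{\mathsf{F}}(a) \wedge \bar{\mathsf{F}}(b) \wedge \cdots \wedge \bar{\mathsf{F}}(c). \tag{4.137}$$

If we now consider two bivectors $a_1 \wedge a_2$ and $b_1 \wedge b_2$, we find that

$$\begin{aligned} (a_1 \wedge a_2) \cdot \bar{\mathsf{F}}(b_1 \wedge b_2) &= a_1 \cdot \bar{\mathsf{F}}(b_2) \, a_2 \cdot \bar{\mathsf{F}}(b_1) - a_1 \cdot \bar{\mathsf{F}}(b_1) \, a_2 \cdot \bar{\mathsf{F}}(b_2) \\ &= \mathsf{F}(a_1) \cdot b_2 \, \mathsf{F}(a_2) \cdot b_1 - \mathsf{F}(a_1) \cdot b_1 \, \mathsf{F}(a_2) \cdot b_2 \\ &= \mathsf{F}(a_1 \wedge a_2) \cdot (b_1 \wedge b_2). \end{aligned} \tag{4.138}$$

It follows that for two bivectors $B_1$ and $B_2$

$$B_1 \cdot \bar{\mathsf{F}}(B_2) = \mathsf{F}(B_1) \cdot B_2. \tag{4.139}$$

This result extends for arbitrary multivectors to give

$$\langle A \bar{\mathsf{F}}(B) \rangle = \langle \mathsf{F}(A) B \rangle. \tag{4.140}$$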


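This is a special case of an even more general and powerful result. Consider the
expression

$$\begin{aligned} a \cdot \bar{\mathsf{F}}(b \wedge c) &= a \cdot \bar{\mathsf{F}}(b) \, \bar{\mathsf{F}}(c) - a \cdot \bar{\mathsf{F}}(c) \, \bar{\mathsf{F}}(b) \\ &= \bar{\mathsf{F}}(\mathsf{F}(a) \cdot b \, c - \mathsf{F}(a) \cdot c \, b) \\ &= \bar{\mathsf{F}}(\mathsf{F}(a) \cdot (b \wedge c)). \end{aligned} \tag{4.141}$$

Building up in this way we establish the useful results:

$$\begin{aligned} A_r \cdot \bar{\mathsf{F}}(B_s) &= \bar{\mathsf{F}}(\mathsf{F}(A_r) \cdot B_s), & r \le s, \\ \mathsf{F}(A_r) \cdot B_s &= \mathsf{F}(A_r \cdot \bar{\mathsf{F}}(B_s)), & r \ge s. \end{aligned} \tag{4.142}$$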
Figure 4.4 The determinant. The unit cube is transformed to a parallelepiped
with sides $\mathsf{F}(\mathsf{e}_1)$, $\mathsf{F}(\mathsf{e}_2)$ and $\mathsf{F}(\mathsf{e}_3)$. The determinant is the
volume scale factor, so is given by the volume of the parallelepiped,
$\mathsf{F}(\mathsf{e}_1) \wedge \mathsf{F}(\mathsf{e}_2) \wedge \mathsf{F}(\mathsf{e}_3) = \mathsf{F}(I)$.


These reduce to equation (4.140) in the case when $r = s$. One way to think
of these formulae is as follows. In the expression $\mathsf{F}(A_r) \cdot B_s$, with $r \ge s$, there
are $r$ separate applications of the function $\mathsf{F}$ on vectors. When the result is
contracted with $B_s$, $s$ of these applications are converted to adjoint functions $\bar{\mathsf{F}}$.
The remaining $r - s$ applications act on the multivector $A_r \cdot \bar{\mathsf{F}}(B_s)$, which has
grade $r - s$.


4.4.4 The determinant 


Now that we have seen how a linear function defines an action on the entire 
geometric algebra, we can give a very compact definition of the determinant. 
The pseudoscalar for any space is unique up to scaling, and linear functions are 
grade-preserving, so we define 


F(I) = det (F) I. (4.143) 


It should be immediately apparent that this definition of the determinant is 
much more compact and intuitive than the matrix definition (discussed later). 
The definition (4.143) shows clearly that the determinant is the volume scale 
factor for the operation $\mathsf{F}$. In particular, acting on the unit hypercube, the
result $\mathsf{F}(I)$ returns the directed volume of the resultant object (see figure 4.4).
As an example of the power of the geometric algebra definition, consider the
product of two functions, $\mathsf{F}$ and $\mathsf{G}$. From equation (4.130) it follows that

$$\det(\mathsf{FG}) \, I = \mathsf{FG}(I) = \det(\mathsf{G}) \, \mathsf{F}(I) = \det(\mathsf{F}) \det(\mathsf{G}) \, I, \tag{4.144}$$
which establishes that the determinant of the product of two functions is the
product of their determinants. This is one of the key properties of the
determinant, yet in conventional developments it is hard to prove. By contrast, the
geometric algebra approach establishes the result in a few lines. Similarly, one
can easily establish that the determinant of the adjoint is the same as that of
the original function,

$$\det(\bar{\mathsf{F}}) = \langle \bar{\mathsf{F}}(I) I^{-1} \rangle = \langle I \, \mathsf{F}(I^{-1}) \rangle = \det(\mathsf{F}). \tag{4.145}$$
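In three dimensions the definition (4.143) can be evaluated as a scalar triple product, since $\mathsf{F}(\mathsf{e}_1) \wedge \mathsf{F}(\mathsf{e}_2) \wedge \mathsf{F}(\mathsf{e}_3)$ is the signed volume of the image of the unit cube. The sketch below uses that identification; the maps `F` and `G` are hypothetical examples and `det3` is a name introduced only here.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def det3(F):
    """det(F) as the signed volume of F(e1) ^ F(e2) ^ F(e3), relative to e1 ^ e2 ^ e3."""
    e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
    return dot(F(e1), cross(F(e2), F(e3)))

# Two hypothetical linear maps, given as functions rather than matrices.
def F(a):
    x, y, z = a
    return [x + 2 * y, 3 * y - z, 4 * x + z]

def G(a):
    x, y, z = a
    return [y, x + z, 2 * z]

FG = lambda a: F(G(a))

print(det3(F), det3(G), det3(FG))   # -5 -2 10
```

The third value is the product of the first two, as (4.144) requires, and no component matrix was needed to obtain it.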


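Example 4.1
Consider the linear function

$$\mathsf{F}(a) = a + \alpha \, a \cdot f_1 \, f_2, \tag{4.146}$$

where $\alpha$ is a scalar and $f_1$ and $f_2$ are a pair of arbitrary vectors. Construct the
action of $\mathsf{F}$ on a general multivector and find its determinant.
We start by forming

$$\begin{aligned} \mathsf{F}(a \wedge b) &= (a + \alpha \, a \cdot f_1 \, f_2) \wedge (b + \alpha \, b \cdot f_1 \, f_2) \\ &= a \wedge b + \alpha (b \cdot f_1 \, a - a \cdot f_1 \, b) \wedge f_2 \\ &= a \wedge b + \alpha ((a \wedge b) \cdot f_1) \wedge f_2. \end{aligned} \tag{4.147}$$

It follows that

$$\mathsf{F}(A) = A + \alpha (A \cdot f_1) \wedge f_2. \tag{4.148}$$

The determinant is now calculated as follows:

$$\begin{aligned} \mathsf{F}(I) &= I + \alpha (I \cdot f_1) \wedge f_2 \\ &= I + \alpha f_1 \cdot f_2 \, I, \end{aligned} \tag{4.149}$$

hence $\det(\mathsf{F}) = 1 + \alpha f_1 \cdot f_2$.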


4.4.5 The inverse 


We now construct a simple, explicit formula for the inverse of a linear function.
We start by considering a multivector $B$, lying entirely in the algebra defined by
the pseudoscalar $I$. For these we have

$$\det(\mathsf{F}) \, I B = \mathsf{F}(I) B = \mathsf{F}(I \bar{\mathsf{F}}(B)), \tag{4.150}$$

where we have used the adjoint formulae of equation (4.142). The inner product
with a pseudoscalar is replaced with a geometric product, since no other grades
are present in the full product. Replacing $IB$ by $A$ we find that

$$\det(\mathsf{F}) \, A = \mathsf{F}(I \bar{\mathsf{F}}(I^{-1} A)) \tag{4.151}$$
with a similar result holding for the adjoint. It follows that

$$\begin{aligned} \mathsf{F}^{-1}(A) &= I \bar{\mathsf{F}}(I^{-1} A) \det(\mathsf{F})^{-1}, \\ \bar{\mathsf{F}}^{-1}(A) &= I \mathsf{F}(I^{-1} A) \det(\mathsf{F})^{-1}. \end{aligned} \tag{4.152}$$


These relations provide simple, explicit formulae for the inverse of a function. 
The derivation of these formulae is considerably quicker than anything available 
in traditional matrix/tensor analysis. 


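Example 4.2
Find the inverse of the function defined in equation (4.146).
With

$$\mathsf{F}(A) = A + \alpha (A \cdot f_1) \wedge f_2 \tag{4.153}$$

we have

$$\begin{aligned} \langle A_r \mathsf{F}(B_r) \rangle &= \langle A_r B_r \rangle + \alpha \langle A_r (B_r \cdot f_1) \wedge f_2 \rangle \\ &= \langle A_r B_r \rangle + \alpha \langle f_2 \cdot A_r \, B_r f_1 \rangle, \end{aligned} \tag{4.154}$$

hence

$$\bar{\mathsf{F}}(A) = A + \alpha f_1 \wedge (f_2 \cdot A). \tag{4.155}$$

It follows that

$$\begin{aligned} \mathsf{F}^{-1}(A) &= I \big( I^{-1} A + \alpha f_1 \wedge (f_2 \cdot (I^{-1} A)) \big) (1 + \alpha f_1 \cdot f_2)^{-1} \\ &= \big( A + \alpha f_1 \cdot (f_2 \wedge A) \big) (1 + \alpha f_1 \cdot f_2)^{-1} \\ &= A - \frac{\alpha}{1 + \alpha f_1 \cdot f_2} (A \cdot f_1) \wedge f_2. \end{aligned} \tag{4.156}$$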
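Restricted to vectors, the results of Examples 4.1 and 4.2 read $\det(\mathsf{F}) = 1 + \alpha f_1 \cdot f_2$ and $\mathsf{F}^{-1}(a) = a - \alpha \, a \cdot f_1 \, f_2 / (1 + \alpha f_1 \cdot f_2)$. A quick numerical check in three dimensions is sketched below; the values of `ALPHA`, `F1` and `F2` are arbitrary choices made for the illustration, and the determinant is evaluated as a triple product of the images of an orthonormal frame.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

ALPHA = 2.0                       # example values, not taken from the text
F1 = (1.0, 0.0, 1.0)
F2 = (0.0, 1.0, 1.0)

def F(a):
    """F(a) = a + alpha (a . f1) f2, equation (4.146)."""
    s = ALPHA * dot(a, F1)
    return tuple(x + s * y for x, y in zip(a, F2))

def F_inv(a):
    """Vector part of (4.156): a - alpha (a . f1) f2 / (1 + alpha f1 . f2)."""
    s = ALPHA * dot(a, F1) / (1.0 + ALPHA * dot(F1, F2))
    return tuple(x - s * y for x, y in zip(a, F2))

def det3(func):
    """Signed volume of the image of the unit cube."""
    e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    return dot(func(e1), cross(func(e2), func(e3)))

a = (1.0, 2.0, 3.0)
print(det3(F), 1.0 + ALPHA * dot(F1, F2))   # both give the determinant
print(F_inv(F(a)))                          # recovers a
```

The inverse fails only when $1 + \alpha f_1 \cdot f_2 = 0$, which is exactly the case of vanishing determinant.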


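Example 4.3
Find the inverse of the rotation

$$\mathsf{R}(a) = R a R^\dagger, \tag{4.157}$$

where $R$ is a rotor.
We have already seen that the action of $\mathsf{R}$ on a general multivector is

$$\mathsf{R}(A) = R A R^\dagger \quad \text{and} \quad \bar{\mathsf{R}}(A) = R^\dagger A R. \tag{4.158}$$

Hence

$$\det(\mathsf{R}) \, I = R I R^\dagger = I R R^\dagger = I, \tag{4.159}$$

so $\det(\mathsf{R}) = 1$. It follows that

$$\mathsf{R}^{-1}(A) = I R^\dagger I^{-1} A R = R^\dagger A R = \bar{\mathsf{R}}(A), \tag{4.160}$$

so, as expected, the inverse of a rotation is the same as the adjoint. This is the
definition of an orthogonal transformation.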
4.4.6 Eigenvectors and eigenblades


We assume that readers are familiar with the concept of an eigenvalue and eigen- 
vector of a matrix. All of the standard results for these have obvious counterparts 
in the geometric algebra framework. This subject will be explored more thor- 
oughly in chapter 11. Here we give a simple outline, concentrating on the new 
concepts that geometric algebra offers. A linear function $\mathsf{F}$ has an eigenvector $e$
if

$$\mathsf{F}(e) = \lambda e. \tag{4.161}$$

The scalar $\lambda$ is the associated eigenvalue. It follows that

$$\det(\mathsf{F} - \lambda \mathsf{I}) = 0, \tag{4.162}$$

which defines a polynomial equation for $\lambda$. Techniques for finding eigenvalues
and eigenvectors are discussed widely in the literature.

In general, the polynomial equation for $\lambda$ will have complex roots. Traditional
developments of the subject usually allow these and consider linear superpositions
over the complex field. But if one starts with a real mapping between real
vectors it is not clear that this formal complexification is useful. What one would
like would be a more geometric classification of a general linear transformation.
This is provided by the notion of an eigenblade. We extend the notion of an
eigenvector to that of an eigenblade $A_r$ satisfying

$$\mathsf{F}(A_r) = \lambda A_r, \tag{4.163}$$

where $A_r$ is a grade-$r$ blade and $\lambda$ is real. One immediate example is the
pseudoscalar, for which A = det (F). More generally, each eigenblade determines 
an invariant subspace of the transformation. 

As an example of the geometric clarity of the eigenblade concept, consider a 
function satisfying 


$$\mathsf{F}(e_1) = \lambda e_2, \qquad \mathsf{F}(e_2) = -\lambda e_1. \tag{4.164}$$

Traditionally, one might write that $e_1 \pm i e_2$ are eigenvectors with eigenvalues
$\mp i \lambda$, where $i$ is the unit imaginary. But the identity

$$\mathsf{F}(e_1 \wedge e_2) = \lambda^2 e_1 \wedge e_2 \tag{4.165}$$

identifies the plane $e_1 \wedge e_2$ as an eigenbivector of $\mathsf{F}$. The role of the complex
structure inherent in $\mathsf{F}$ is played by the unit bivector $e_1 \wedge e_2$. A linear function
can have many distinct eigenbivectors, each acting as a distinct imaginary for
its own plane. Replacing all of these by a single scalar imaginary throws away a
considerable amount of useful information.
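The eigenbivector in (4.165) can be seen in a few lines of arithmetic. The sketch below takes $e_1$, $e_2$ to be orthonormal and represents a bivector in the plane by its single coefficient on $e_1 \wedge e_2$; the value of `LAM` and the helper `wedge` are assumptions made for this example.

```python
LAM = 3.0                                  # example value of lambda

def F(a):
    """The map of (4.164): F(e1) = lam e2, F(e2) = -lam e1."""
    x, y = a
    return (-LAM * y, LAM * x)

def wedge(u, v):
    """Coefficient of e1 ^ e2 in the outer product u ^ v."""
    return u[0] * v[1] - u[1] * v[0]

a, b = (1.0, 2.0), (3.0, -1.0)
print(wedge(F(a), F(b)), LAM ** 2 * wedge(a, b))   # equal: F(a ^ b) = lam^2 a ^ b

# No real eigenvector exists: the characteristic polynomial
# x^2 - trace x + det has negative discriminant.
trace = F((1.0, 0.0))[0] + F((0.0, 1.0))[1]
det = wedge(F((1.0, 0.0)), F((0.0, 1.0)))
print(trace ** 2 - 4 * det)
```

The plane is mapped to itself with a real scale factor $\lambda^2$, even though no individual real vector in it is an eigenvector.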
4.4.7 Symmetric and antisymmetric functions


An important aspect of the theory of linear functions is finding natural,
canonical† expressions for a function. For symmetric functions in Euclidean space
this form is via its spectral decomposition. If $e_i$ and $e_j$ are eigenvectors of a
function, with eigenvalues $\lambda_i$ and $\lambda_j$, we have (no sums implied)

$$e_i \cdot \mathsf{F}(e_j) = e_i \cdot (\lambda_j e_j) = \lambda_j \, e_i \cdot e_j. \tag{4.166}$$

But if $\mathsf{F}$ is symmetric, this also equals

$$\bar{\mathsf{F}}(e_i) \cdot e_j = \mathsf{F}(e_i) \cdot e_j = (\lambda_i e_i) \cdot e_j = \lambda_i \, e_i \cdot e_j. \tag{4.167}$$

It follows that

$$(\lambda_i - \lambda_j) \, e_i \cdot e_j = 0, \tag{4.168}$$

so eigenvectors of a symmetric function with distinct eigenvalues must be
orthogonal.

If we admit the existence of complex eigenvectors and eigenvalues we also find
that (no sums)

$$e^* \cdot \mathsf{F}(e) = \lambda \, e^* \cdot e = \mathsf{F}(e^*) \cdot e = \lambda^* e^* \cdot e. \tag{4.169}$$

So for any symmetric function we also have

$$(\lambda - \lambda^*) \, e^* \cdot e = 0. \tag{4.170}$$

Provided $e^* \cdot e \neq 0$ we can conclude that the eigenvalue, and hence the
eigenvector, is real. In Euclidean space this inequality is always satisfied, and every
symmetric function on an $n$-dimensional space has a spectral decomposition of
the form

$$\mathsf{F}(a) = \lambda_1 \mathsf{P}_1(a) + \lambda_2 \mathsf{P}_2(a) + \cdots + \lambda_m \mathsf{P}_m(a). \tag{4.171}$$

Here $\lambda_1 < \lambda_2 < \cdots < \lambda_m$ are the $m$ distinct eigenvalues ($m \le n$) and the $\mathsf{P}_i$ are
projections onto each of the invariant subspaces defined by the eigenvectors. For
the case of a projection onto a one-dimensional space we have simply

$$\mathsf{P}_i(a) = a \cdot e_i \, e_i. \tag{4.172}$$


The eigenvectors form an orthonormal frame, which is the natural frame in which
to study the linear function. If two eigenvalues are the same, it is always possible
to choose the eigenvectors so that they remain orthogonal. In non-Euclidean
spaces, such as spacetime, one has to be careful due to the possibility of complex
null vectors. These can have $e^* \cdot e = 0$, so the above reasoning breaks down and
one cannot guarantee the existence of an orthonormal frame of eigenvectors. We
will encounter examples of this when we study gravitation.

† The origin of the use of the word canonical is obscure; see for example the comments in
Goldstein (1950). In mathematical physics, a canonical form usually refers to a standard
way of simplifying an expression without altering its meaning.


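Antisymmetric functions have $\bar{\mathsf{F}}(a) = -\mathsf{F}(a)$. It follows that

$$a \cdot \mathsf{F}(a) = \bar{\mathsf{F}}(a) \cdot a = -\mathsf{F}(a) \cdot a = 0. \tag{4.173}$$

The natural way to study antisymmetric functions is through the bivector

$$F = \tfrac{1}{2} \, \mathsf{e}^i \wedge \mathsf{F}(\mathsf{e}_i), \tag{4.174}$$

where the $\{\mathsf{e}_k\}$ are an arbitrary frame for the space acted on by $\mathsf{F}$. The bivector
$F$ is independent of the choice of frame, so is an invariant quantity. One can
easily confirm that the bivector $F$ has the same number of degrees of freedom
as $\mathsf{F}$. If we now form $2 a \cdot F$ we find that

$$\begin{aligned} 2 a \cdot F &= a \cdot (\mathsf{e}^i \wedge \mathsf{F}(\mathsf{e}_i)) \\ &= a \cdot \mathsf{e}^i \, \mathsf{F}(\mathsf{e}_i) - \mathsf{e}^i \, a \cdot \mathsf{F}(\mathsf{e}_i) \\ &= \mathsf{F}(a \cdot \mathsf{e}^i \, \mathsf{e}_i) + \mathsf{e}^i \, \mathsf{e}_i \cdot \mathsf{F}(a) \\ &= 2 \mathsf{F}(a). \end{aligned} \tag{4.175}$$

The action of an antisymmetric function therefore reduces to contracting with
the characteristic bivector $F$:

$$\mathsf{F}(a) = a \cdot F. \tag{4.176}$$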


The problem of reducing an antisymmetric function to its simplest form reduces 
to that of splitting F into a set of commuting blades: 


$$F = \lambda_1 \hat{F}_1 + \cdots + \lambda_k \hat{F}_k, \tag{4.177}$$


where $k \le n/2$ and each of the $\hat{F}_i$ is a unit blade. This decomposition is always
possible in Euclidean space, though the answer is only unique if the blades all 
have different magnitudes. Each component blade of F is an eigenblade of F 
and determines an invariant subspace. Within this subspace the effect of F is 
simply to rotate all vectors by +90°, and to scale the result by the magnitude 
of the eigenblade. In non-Euclidean spaces such a decomposition is not always 
possible. 


4.4.8 The singular value decomposition 


For linear functions of no symmetry a number of alternative canonical forms can
be found. Among these, perhaps the most useful is the singular value
decomposition. We start with an arbitrary function $\mathsf{F}$ and restrict the discussion to the
case where $\mathsf{F}$ acts on an $n$-dimensional Euclidean space. We also suppose that
$\det(\mathsf{F}) \neq 0$; the case of $\det(\mathsf{F}) = 0$ is easily dealt with by separating out the space
which is mapped onto the origin, and working with a reduced function acting in
the subspace over which $\mathsf{F}$ is non-singular. We next form the function $\mathsf{D}$ by

$$\mathsf{D}(a) = \bar{\mathsf{F}}\mathsf{F}(a). \tag{4.178}$$

This function is symmetric and has $n$ orthogonal eigenvectors with real, positive
eigenvalues. The fact that the eigenvalues are positive follows from

$$\bar{\mathsf{F}}\mathsf{F}(e) = \lambda e \implies \mathsf{F}(e) \cdot \mathsf{F}(e) = \lambda e^2. \tag{4.179}$$

Since (in Euclidean space) the square of any vector is a positive scalar we see that
$\lambda$ must be positive. The assumption that $\det(\mathsf{F}) \neq 0$ rules out the possibility of
any eigenvalues being zero. It follows that we can write

$$\mathsf{D}(a) = \sum_{i=1}^{n} \lambda_i \, a \cdot e_i \, e_i, \tag{4.180}$$

where the $\{e_i\}$ are the orthonormal frame of eigenvectors. Degenerate
eigenvalues are dealt with by picking a set of arbitrary orthonormal vectors in the
invariant subspace.

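The linear function $\mathsf{D}$ has a simple (positive) square root,

$$\mathsf{D}^{1/2}(a) = \sum_{i=1}^{n} \lambda_i^{1/2} \, a \cdot e_i \, e_i, \tag{4.181}$$

and this is also invertible,

$$\mathsf{D}^{-1/2}(a) = \sum_{i=1}^{n} \lambda_i^{-1/2} \, a \cdot e_i \, e_i. \tag{4.182}$$

We now set

$$\mathsf{S} = \mathsf{F} \mathsf{D}^{-1/2}. \tag{4.183}$$

This satisfies

$$\bar{\mathsf{S}}\mathsf{S} = \mathsf{D}^{-1/2} \bar{\mathsf{F}}\mathsf{F} \mathsf{D}^{-1/2} = \mathsf{D}^{-1/2} \mathsf{D} \mathsf{D}^{-1/2} = \mathsf{I}, \tag{4.184}$$

where $\mathsf{I}$ is the identity function. It follows that $\mathsf{S}$ is an orthogonal function. The
function $\mathsf{F}$ can now be written

$$\mathsf{F} = \mathsf{S} \mathsf{D}^{1/2}. \tag{4.185}$$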


This represents a series of dilations along the eigendirections of D, followed by a 
rotation. 

If the linear function $\mathsf{F}$ is presented as an $n \times n$ matrix of components in some
frame, then one usually includes a further rotation $\mathsf{R}$ to align this arbitrary frame
with the frame of eigenvectors. In this case one writes

$$\mathsf{F} = \mathsf{S} \Lambda \mathsf{R}, \tag{4.186}$$

where $\Lambda$ is a diagonal matrix in the arbitrary coordinate frame. This writes a
matrix as a dilation sandwiched between two rotations, and is called the singular
value decomposition of the matrix. An arbitrary linear function in $n$ dimensions
has $n^2$ degrees of freedom. The singular value decomposition assigns $2 \times n(n-1)/2$
of these to the two orthogonal transformations $\mathsf{R}$ and $\mathsf{S}$, with the remaining $n$
degrees of freedom contained in the dilation $\Lambda$. The singular value decomposition
appears frequently in subjects such as data analysis, where it is often used in
connection with analysing non-square matrices.
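The factorisation $\mathsf{F} = \mathsf{S}\mathsf{D}^{1/2}$ can be carried out by hand for a $2 \times 2$ matrix. The sketch below assumes an orthonormal frame, so that the adjoint is the matrix transpose, and replaces the eigenvector sum of (4.181) with a closed form valid only for $2 \times 2$ symmetric positive-definite matrices, $\sqrt{D} = (D + \sqrt{\det D}\,I)/\sqrt{\operatorname{tr} D + 2\sqrt{\det D}}$. The function names and the example matrix are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def inverse(A):
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def sqrt_spd(D):
    """Square root of a 2x2 symmetric positive-definite matrix (closed form)."""
    s = math.sqrt(det(D))
    t = math.sqrt(D[0][0] + D[1][1] + 2.0 * s)
    return [[(D[0][0] + s) / t, D[0][1] / t],
            [D[1][0] / t, (D[1][1] + s) / t]]

def polar(F):
    """Return (S, root) with F = S root, root = D^(1/2), D = F^T F."""
    D = matmul(transpose(F), F)            # matrix of the function Fbar F
    root = sqrt_spd(D)
    S = matmul(F, inverse(root))           # S = F D^(-1/2), equation (4.183)
    return S, root

F = [[2.0, 1.0], [0.0, 1.0]]               # example matrix with det(F) != 0
S, root = polar(F)
print(matmul(transpose(S), S))             # close to the identity: S is orthogonal
print(matmul(S, root))                     # reproduces F
```

Diagonalising `root` with a further rotation would give the three-factor form (4.186); that step is omitted here to keep the example short.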


4.5 Tensors and components 


Many modern physics textbooks are written in the language of tensor analysis. 
In this approach one often works directly with the components of a vector, or 
linear function, in a chosen coordinate frame. The invariance of the laws under 
a change of frame can then be used to advantage to simplify the component 
equations. Since this approach is so ubiquitous it is important to establish the 
relationship between tensor analysis and the largely frame-free approach of the 
present chapter. We start by analysing Cartesian tensors, and then move onto 
the more general case of an arbitrary coordinate frame. 


4.5.1 Cartesian tensors 


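The subject of Cartesian tensors arises when we restrict our frames to consist
only of orthonormal vectors in Euclidean space. For these we have

$$\mathsf{e}_i \cdot \mathsf{e}_j = \delta_{ij}, \tag{4.187}$$

so there is no distinction between frames and their reciprocals. In this case we
can drop all distinction between raised and lowered indices, and just work with
all indices lowered. Provided both frames have the same orientation, a new frame
is obtained from the $\{\mathsf{e}_k\}$ frame by a rotation,

$$\mathsf{e}_i' = R \mathsf{e}_i R^\dagger = \Lambda_{ij} \mathsf{e}_j. \tag{4.188}$$

Here $R$ is a rotor and $\Lambda_{ij}$ are the components of the rotation defined by $R$:

$$\Lambda_{ij} = (R \mathsf{e}_i R^\dagger) \cdot \mathsf{e}_j. \tag{4.189}$$

It follows that

$$\begin{aligned} \Lambda_{ij} \Lambda_{ik} &= (R \mathsf{e}_i R^\dagger) \cdot \mathsf{e}_j \, (R \mathsf{e}_i R^\dagger) \cdot \mathsf{e}_k \\ &= (R^\dagger \mathsf{e}_j R) \cdot (R^\dagger \mathsf{e}_k R) = \delta_{jk}, \end{aligned} \tag{4.190}$$

and similarly

$$\Lambda_{ik} \Lambda_{jk} = \delta_{ij}. \tag{4.191}$$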
A vector $a$ has components $a_i = \mathsf{e}_i \cdot a$ and these transform under a change of
frame in the obvious manner,

$$a_i' = \mathsf{e}_i' \cdot a = \Lambda_{ij} a_j. \tag{4.192}$$

It is important to realise here that it is only the components of $a$ that change, not
the underlying vector itself. The change in components is exactly cancelled by
the change in the frame. Many equations in physics are invariant if the vector
itself is transformed, but this is the result of an underlying symmetry in the
equations, and not of the freedom to choose the coordinate system. These two
concepts should not be confused!

Extending this idea, we define the components of the linear function $\mathsf{F}$ by

$$\mathsf{F}_{ij} = \mathsf{e}_i \cdot \mathsf{F}(\mathsf{e}_j). \tag{4.193}$$

The result of this decomposition is an $n \times n$ array of components, which can be
stored and manipulated as a matrix. This definition ensures that the components
of the vector $\mathsf{F}(a)$ are given by

$$\mathsf{e}_i \cdot \mathsf{F}(a) = \mathsf{e}_i \cdot \mathsf{F}(a_j \mathsf{e}_j) = \mathsf{F}_{ij} a_j, \tag{4.194}$$


which is the usual expression for a matrix acting on a column vector. Similarly,
if $\mathsf{F}$ and $\mathsf{G}$ are a pair of linear functions, the components of the product function
$\mathsf{FG}$ are given by

$$\begin{aligned} (\mathsf{FG})_{ij} &= \mathsf{FG}(\mathsf{e}_j) \cdot \mathsf{e}_i = \mathsf{G}(\mathsf{e}_j) \cdot \bar{\mathsf{F}}(\mathsf{e}_i) \\ &= \mathsf{G}(\mathsf{e}_j) \cdot \mathsf{e}_k \, \mathsf{e}_k \cdot \bar{\mathsf{F}}(\mathsf{e}_i) = \mathsf{F}_{ik} \mathsf{G}_{kj}. \end{aligned} \tag{4.195}$$

This recovers the familiar rule for multiplying matrices. If the frame is changed
to a new rotated frame, the components of the tensor transform in the obvious
way:

$$\mathsf{F}_{ij}' = \Lambda_{ik} \Lambda_{jl} \mathsf{F}_{kl}, \tag{4.196}$$

where the prime denotes the components in the new (primed) frame. Objects
with two indices are referred to as rank-2 tensors. Rank-1 tensors are vectors,
rank-3 tensors have three indices, and so on. Since rank-2 tensors appear
regularly in physics they are often referred to simply as tensors. Also, it is usual to
let the term tensor refer to either the component form $\mathsf{F}_{ij}$ or the abstract entity
$\mathsf{F}$.

For Cartesian tensors there are two important tensors which arise regularly 
in computations. These are the two invariant tensors. The first of these is the 
Kronecker $\delta$, which transforms as

$$\delta'_{ij} = \Lambda_{ik}\Lambda_{jl}\delta_{kl} = \Lambda_{ik}\Lambda_{jk} = \delta_{ij}. \tag{4.197}$$


The components of the identity function are therefore the same in all orthonormal 
frames (and are those of the identity matrix in all cases). The second invariant is 
the alternating tensor $\epsilon_{ij\cdots k}$, where the number of indices matches the dimension
of the space. This is totally antisymmetric and is defined as follows:

$$\epsilon_{ij\cdots k} = \begin{cases} 1 & i,j,\ldots,k = \text{even permutation of } 1,2,\ldots,n \\ -1 & i,j,\ldots,k = \text{odd permutation of } 1,2,\ldots,n \\ 0 & \text{otherwise.} \end{cases} \tag{4.198}$$

The order of a permutation is the number of pairwise swaps required to re- 
turn to the original order 1,2,...,n. If an even number of swaps is required 
the permutation is even, and similarly for the odd case. In three dimensions
even permutations of 1,2,3 coincide with cyclic orderings of the indices. The
determinant of a matrix can be expressed in terms of the alternating tensor via

$$F_{\alpha i}F_{\beta j}\cdots F_{\gamma k}\,\epsilon_{\alpha\beta\cdots\gamma} = \det(\mathsf{F})\,\epsilon_{ij\cdots k}. \tag{4.199}$$


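Given this result, it is straightforward to prove the frame invariance of the
alternating tensor under rotations:

$$\epsilon'_{ij\cdots k} = \Lambda_{i\alpha}\Lambda_{j\beta}\cdots\Lambda_{k\gamma}\,\epsilon_{\alpha\beta\cdots\gamma} = \det(\Lambda)\,\epsilon_{ij\cdots k}. \tag{4.200}$$

But since $\Lambda_{ij}$ is a rotation matrix it has determinant +1, so the tensor is indeed
invariant.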
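
The determinant formula (4.199) is easy to test for a specific matrix. The sketch below is an illustration only, not part of the original text: it works in three dimensions, uses 0-based indices in place of the 1-based indices of the text, and the helper names (`parity`, `det`, `lhs`) and the example matrix are hypothetical. The function `parity` counts the pairwise swaps needed to sort the indices, following the definition (4.198).

```python
from itertools import permutations, product

def parity(indices):
    """Alternating tensor: +1/-1 for an even/odd permutation of 0..n-1, else 0."""
    n = len(indices)
    if sorted(indices) != list(range(n)):
        return 0                      # a repeated (or missing) index
    seen, sign = list(indices), 1
    for i in range(n):
        while seen[i] != i:           # each swap puts one index in its place
            j = seen[i]
            seen[i], seen[j] = seen[j], seen[i]
            sign = -sign
    return sign

def det(F):
    """Determinant from the sum over permutations of the row indices."""
    n = len(F)
    total = 0
    for p in permutations(range(n)):
        term = parity(p)
        for col, row in enumerate(p):
            term *= F[row][col]
        total += term
    return total

def lhs(F, i, j, k):
    """Left-hand side of (4.199) in three dimensions, summed over alpha, beta, gamma."""
    return sum(F[a][i] * F[b][j] * F[c][k] * parity((a, b, c))
               for a, b, c in product(range(3), repeat=3))

F = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

# (4.199) holds for every choice of the free indices i, j, k
identity_holds = all(lhs(F, i, j, k) == det(F) * parity((i, j, k))
                     for i, j, k in product(range(3), repeat=3))
```

For this matrix `det(F)` evaluates to 39, and `identity_holds` confirms that the contraction vanishes for repeated indices and changes sign under an odd permutation of the columns.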


4.5.2 The determinant revisited 


We should now establish that the definition of the determinant (4.199) agrees 
with our earlier definition (4.143). To prove this we first need the result that 


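$$\epsilon_{ij\cdots k} = e_i\wedge e_j\wedge\cdots\wedge e_k\, I^\dagger, \tag{4.201}$$

where $I = e_1 e_2\cdots e_n$ and the $\{e_k\}$ form an orthonormal frame. The right-
hand side of (4.201) is zero if any of the indices are the same, because of the
antisymmetry of the outer product. If the indices form an even permutation of
$1,2,\ldots,n$ we can reorder the vectors into the order $e_1 e_2\cdots e_n = I$, in which case
the right-hand side of (4.201) returns +1. Similarly, any odd permutation
of $1,2,\ldots,n$ returns −1. Together these agree with the definition (4.198) of the
alternating tensor $\epsilon_{ij\cdots k}$. We can now rearrange the left-hand side of (4.199) as
follows:

$$\begin{aligned} F_{\alpha i}F_{\beta j}\cdots F_{\gamma k}\,\epsilon_{\alpha\beta\cdots\gamma} &= F_{\alpha i}F_{\beta j}\cdots F_{\gamma k}\; e_\alpha\wedge e_\beta\wedge\cdots\wedge e_\gamma\, I^\dagger \\ &= \mathsf{F}(e_i)\wedge\mathsf{F}(e_j)\wedge\cdots\wedge\mathsf{F}(e_k)\, I^\dagger \\ &= \det(\mathsf{F})\; e_i\wedge e_j\wedge\cdots\wedge e_k\, I^\dagger \\ &= \det(\mathsf{F})\,\epsilon_{ij\cdots k}, \end{aligned} \tag{4.202}$$

which recovers the expected result.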


We assume that most readers are familiar with the various techniques employed 
when computing the determinant of an $n \times n$ matrix. These can be found
in most elementary textbooks on linear algebra. It is instructive to see how 
the same results arise in the geometric algebra treatment. We have already 
established that the determinant of the product of two functions is the product of 
the determinants, and that taking the adjoint does not change the determinant. 
To establish a further set of results we first introduce the (non-orthonormal) 
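vectors $\{f_i\}$,

$$f_i = \mathsf{F}(e_i), \tag{4.203}$$

so that

$$F_{ij} = e_i\cdot f_j. \tag{4.204}$$

From equation (4.143) the determinant of $\mathsf{F}$ can be written

$$\det(\mathsf{F}) = (f_1\wedge f_2\wedge\cdots\wedge f_n)\cdot(e_n\wedge\cdots\wedge e_2\wedge e_1). \tag{4.205}$$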


Expanding this product out in full recovers the standard expression for the
determinant of a matrix. The first result we see is that swapping any two of the
$\{f_i\}$ changes the sign of the determinant. This is the same as swapping two
columns in the matrix $F_{ij}$. Since matrix transposition does not affect the result,
the same is true for interchanging rows.

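Next we single out one of the $\{e_k\}$ vectors and write

$$\begin{aligned} \det(\mathsf{F}) &= (-1)^{j+1}\,(e_n\wedge\cdots\wedge\check{e}_j\wedge\cdots\wedge e_1)\cdot\bigl(e_j\cdot(f_1\wedge\cdots\wedge f_n)\bigr) \\ &= \sum_{k=1}^{n}(-1)^{j+k}\, e_j\cdot f_k\;(e_n\wedge\cdots\wedge\check{e}_j\wedge\cdots\wedge e_1)\cdot(f_1\wedge\cdots\wedge\check{f}_k\wedge\cdots\wedge f_n). \end{aligned} \tag{4.206}$$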


The final part of each term in the sum corresponds to an $(n-1)\times(n-1)$
determinant, as can be seen by comparing with (4.205). This is equivalent to
the familiar expression for the expansion of the determinant by the $j$th row. A
further useful result is obtained from the identity


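$$e_n\wedge\cdots\wedge(e_j+\lambda e_k)\wedge\cdots\wedge e_1 = e_n\wedge\cdots\wedge e_j\wedge\cdots\wedge e_1, \qquad j\neq k. \tag{4.207}$$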


This result means that any multiple of the kth row can be added to the jth row 
without changing the result. The same is true for columns. This is the key to 
the method of Gaussian elimination for finding a determinant. In this method 
the matrix is first transformed to upper (or lower) triangular form, so that the 
determinant is then simply the product of the entries down the leading diagonal. 
This is numerically a highly efficient method for calculating determinants. We 
can continue in this manner to give concise proofs of many of the key results for 
determinants. For a useful summary of these, see Turnbull (1960). 
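
The elimination method just described can be sketched in a few lines. This is an illustrative implementation under our own naming (`det_by_elimination` is hypothetical, and exact rational arithmetic is an assumption made so that the result can be compared exactly). It uses only the two properties established above: swapping two rows changes the sign of the determinant, and adding a multiple of one row to another leaves it unchanged.

```python
from fractions import Fraction

def det_by_elimination(rows):
    """Determinant via reduction to upper triangular form, in exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    sign = 1
    for col in range(n):
        # find a row at or below the diagonal with a non-zero entry in this column
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # the column cannot be cleared: singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign                  # a row swap flips the sign
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            # subtracting a multiple of the pivot row does not change the determinant
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= m[i][i]                 # product down the leading diagonal
    return result

F = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
# add three times the first row to the third row
F_shifted = [F[0], F[1], [x + 3 * y for x, y in zip(F[2], F[0])]]
```

Both `det_by_elimination(F)` and `det_by_elimination(F_shifted)` return 39, as (4.207) requires.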

To see how these formulae also lead to the familiar expression for the inverse 
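of a matrix, consider the decomposition:

$$\begin{aligned} F^{-1}_{ij} &= e_i\cdot\mathsf{F}^{-1}(e_j) \\ &= \langle e_i\; e_1\wedge\cdots\wedge e_n\;\bar{\mathsf{F}}(e_n\wedge\cdots\wedge e_1\, e_j)\rangle\,\det(\mathsf{F})^{-1} \\ &= (-1)^{i+j}\,\langle\mathsf{F}(e_1\wedge\cdots\wedge\check{e}_i\wedge\cdots\wedge e_n)\; e_n\wedge\cdots\wedge\check{e}_j\wedge\cdots\wedge e_1\rangle\,\det(\mathsf{F})^{-1}. \end{aligned} \tag{4.208}$$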


The term enclosed in angular brackets is the determinant of the $(n-1)\times(n-1)$
matrix obtained from $F_{ij}$ by deleting the $i$th column and $j$th row. This is the
definition of the $i,j$ cofactor of $F_{ij}$. Equation (4.208) shows that the components
of $\mathsf{F}^{-1}$ are formed from the transposed matrix of cofactors, divided by the
determinant $\det(\mathsf{F})$, which is the familiar result. Similarly, all other matrix formulae have
simple and often elegant counterparts in geometric algebra. Further examples of 
these are discussed in chapter 11. 
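
The cofactor expression for the inverse can be written out directly. The following sketch is an illustration rather than a recommended numerical method: the helper names (`minor`, `det`, `inverse`, `matmul`) are hypothetical, and exact fractions are assumed so that the product with the original matrix can be compared with the identity. Entry $(i,j)$ of the inverse is built from the matrix with the $i$th column and $j$th row deleted, as described above.

```python
from fractions import Fraction

def minor(m, row, col):
    """The matrix m with one row and one column deleted."""
    return [[x for c, x in enumerate(r) if c != col]
            for k, r in enumerate(m) if k != row]

def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 0:
        return 1                          # empty determinant, ends the recursion
    return sum((-1) ** c * m[0][c] * det(minor(m, 0, c)) for c in range(len(m)))

def inverse(m):
    """Transposed matrix of cofactors divided by the determinant."""
    n = len(m)
    d = Fraction(det(m))
    if d == 0:
        raise ValueError("matrix is singular")
    # entry (i, j) uses the minor with row j and column i removed
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Ordinary matrix product of two square matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

F = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
F_inv = inverse(F)
```

Multiplying `F` by `F_inv` with `matmul` returns the identity matrix exactly.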


4.5.3 General tensors 


We now generalise the preceding treatment to the case of arbitrary basis sets 
in spaces of arbitrary (non-degenerate) signature. One reason for wanting to 
deal with non-orthonormal frames is that these regularly arise when working in 
curvilinear coordinate systems. In addition, in mixed signature spaces one has no 
option since it is impossible to identify a frame with its reciprocal. Suppose, then, 
that the vectors $\{e_k\}$ constitute an arbitrary frame for $n$-dimensional space (of
unspecified signature). The reciprocal frame is denoted $\{e^k\}$ and the two frames
are related by

$$e^i\cdot e_j = \delta^i_j. \tag{4.209}$$


Equation (4.94) for the reciprocal frame is general and still holds in mixed sig- 
nature spaces. 

As described in section 4.3.2, the vector $a$ has components $(a^1,a^2,\ldots,a^n)$
in the $\{e_k\}$ frame, and $(a_1,a_2,\ldots,a_n)$ in the $\{e^k\}$ frame. When working with
general coordinate frames we always ensure that upper and lower indices match
separately on either side of an expression. Suppose we now form the inner
product of two vectors $a$ and $b$. We can write this as

$$a\cdot b = (a^i e_i)\cdot(b_j e^j) = a^i b_j\; e_i\cdot e^j = a^i b_j\,\delta_i^j = a^i b_i. \tag{4.210}$$


The general rule is that sums are only taken over pairs of indices where one is a 
superscript and the other a subscript. Another way to write an inner product is 
to introduce the metric tensor $g_{ij}$:

$$g_{ij} = e_i\cdot e_j. \tag{4.211}$$

In terms of its components $g_{ij}$ is a symmetric $n \times n$ matrix. The inverse matrix
is written as $g^{ij}$ and is given by

$$g^{ij} = e^i\cdot e^j. \tag{4.212}$$

It is easily verified that this is the inverse of $g_{ij}$:

$$g^{ik}g_{kj} = e^i\cdot e^k\; e_k\cdot e_j = e^i\cdot e_j = \delta^i_j. \tag{4.213}$$


Employing the metric tensor we can write the inner product of two vectors in a
number of equivalent forms:

$$a\cdot b = a^i b_i = a_i b^i = a^i b^j g_{ij} = a_i b_j g^{ij}. \tag{4.214}$$

Of course, all of these expressions encode the same thing and, unless there is a 
particular reason to introduce a frame, the index-free expression a-b is usually 
the simplest to use. 
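
For readers who do want to see the index expressions at work, the sketch below evaluates them for a non-orthonormal frame in the Euclidean plane. It is an illustration only: the frame, the vectors and all of the names are hypothetical choices of ours, and the reciprocal frame is constructed from $e^i = g^{ij}e_j$, which follows from expanding $e^i$ in the $\{e_k\}$ frame.

```python
from fractions import Fraction as Fr

def dot(u, v):
    """Euclidean inner product of two component lists."""
    return sum(x * y for x, y in zip(u, v))

# a non-orthonormal frame {e_1, e_2} (hypothetical example)
e = [[Fr(1), Fr(0)], [Fr(1), Fr(2)]]

# metric g_ij = e_i . e_j and its inverse g^ij (2 x 2 inverse written out by hand)
g = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_up = [[g[1][1] / det_g, -g[0][1] / det_g],
        [-g[1][0] / det_g, g[0][0] / det_g]]

# reciprocal frame e^i = g^ij e_j
e_up = [[sum(g_up[i][j] * e[j][c] for j in range(2)) for c in range(2)]
        for i in range(2)]

a, b = [Fr(3), Fr(1)], [Fr(-2), Fr(5)]
a_up = [dot(a, e_up[i]) for i in range(2)]   # a^i = a . e^i
a_low = [dot(a, e[i]) for i in range(2)]     # a_i = a . e_i
b_up = [dot(b, e_up[i]) for i in range(2)]
b_low = [dot(b, e[i]) for i in range(2)]

forms = [
    dot(a, b),                                                      # frame-free value
    sum(a_up[i] * b_low[i] for i in range(2)),                      # a^i b_i
    sum(a_low[i] * b_up[i] for i in range(2)),                      # a_i b^i
    sum(a_up[i] * b_up[j] * g[i][j] for i in range(2) for j in range(2)),
    sum(a_low[i] * b_low[j] * g_up[i][j] for i in range(2) for j in range(2)),
]
```

Every entry of `forms` is the same number, which is simply `dot(a, b)` computed without reference to the frame.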

The same ideas extend to expressing the linear function F in a general non- 
orthonormal frame. We let F act on the frame vector e; and find the components 
of the result in the reciprocal frame. The components are then given by

$$F_{ij} = e_i\cdot\mathsf{F}(e_j). \tag{4.215}$$

Again, the set of numbers $F_{ij}$ are referred to as the components of a rank-2
tensor and form an $n \times n$ matrix, the entries of which depend on the choice of
frame. Similar expressions exist for combinations of frame vectors and reciprocal
vectors, for example,

$$F^{ij} = \mathsf{F}(e^j)\cdot e^i. \tag{4.216}$$

One use of the metric tensor is to interchange between these expressions:

$$F^{ij} = e^i\cdot\mathsf{F}(e^j) = e^i\cdot e^k\; e_k\cdot\mathsf{F}(e_l\; e^l\cdot e^j) = g^{ik}g^{jl}F_{kl}. \tag{4.217}$$


Again, we have at our disposal a variety of different ways of encoding the infor- 
mation in F. In terms of the abstract concept of a linear operator, the metric 
tensor $g_{ij}$ is simply the identity operator expressed in a non-orthonormal frame.
If $F_{ij}$ are the components of $\mathsf{F}$ in some frame then the components of the
adjoint $\bar{\mathsf{F}}$ are given by

$$\bar{F}_{ij} = \bar{\mathsf{F}}(e_j)\cdot e_i = e_j\cdot\mathsf{F}(e_i) = F_{ji}. \tag{4.218}$$

That is, viewed as a matrix, the components of $\bar{\mathsf{F}}$ are found from the components
of $\mathsf{F}$ by matrix transposition. For mixed index tensors we have to be slightly more
careful, as we now have

$$\bar{F}_i{}^j = \bar{\mathsf{F}}(e^j)\cdot e_i = e^j\cdot\mathsf{F}(e_i) = F^j{}_i. \tag{4.219}$$


If $\mathsf{F}$ is a symmetric function we have $\bar{\mathsf{F}} = \mathsf{F}$. In this case the component matrices
satisfy

$$F_{ij} = \mathsf{F}(e_j)\cdot e_i = \mathsf{F}(e_i)\cdot e_j = F_{ji}, \tag{4.220}$$

so the components $F_{ij}$ form a symmetric matrix. The same is true of $F^{ij} = F^{ji}$,
but for the mixed tensor $F_i{}^j$ we have $F_i{}^j = F^j{}_i$.

The components of the product function $\mathsf{FG}$ are found from the following
rearrangement:

$$(\mathsf{FG})_{ij} = \mathsf{FG}(e_j)\cdot e_i = \mathsf{G}(e_j)\cdot\bar{\mathsf{F}}(e_i) = \mathsf{G}(e_j)\cdot e_k\; e^k\cdot\bar{\mathsf{F}}(e_i) = F_i{}^k G_{kj}. \tag{4.221}$$


Provided the correct combination of subscript and superscript indices is used, 
this can be viewed as a matrix product. Alternatively, one can work entirely 
with subscripted indices, and include suitable factors of the metric tensor, 


$$(\mathsf{FG})_{ij} = F_{ik}G_{lj}\,g^{kl}. \tag{4.222}$$


Higher rank linear functions give rise to higher rank tensors. Suppose, for
example, that $\phi(a_1,a_2,a_3)$ is a scalar function of three vectors, and is linear on
each argument,

$$\phi(\lambda a_1+\mu b_1, a_2, a_3) = \lambda\,\phi(a_1,a_2,a_3) + \mu\,\phi(b_1,a_2,a_3), \quad\text{etc.} \tag{4.223}$$

The components of this define a rank-3 tensor via

$$\phi_{ijk} = \phi(e_i,e_j,e_k). \tag{4.224}$$


Using similar schemes it is a straightforward matter to set up a map between 
tensor equations and frame-free expressions in geometric algebra. 


4.5.4 Coordinate transformations 


If a second non-orthonormal frame $\{f_\alpha\}$ is introduced we can relate the two
frames via a transformation matrix $f_{\alpha i}$:

$$f_{\alpha i} = f_\alpha\cdot e_i, \qquad f^{\alpha i} = f^\alpha\cdot e^i, \tag{4.225}$$

where Latin and Greek indices distinguish the components in one frame from
the other. These matrices satisfy

$$f_{\alpha i}f^{\alpha j} = f_\alpha\cdot e_i\; f^\alpha\cdot e^j = e_i\cdot e^j = \delta_i^j \tag{4.226}$$

and

$$f_{\alpha i}f^{\beta i} = f_\alpha\cdot e_i\; f^\beta\cdot e^i = f_\alpha\cdot f^\beta = \delta_\alpha^\beta. \tag{4.227}$$

The decomposition of the vector $a$ in terms of these frames gives

$$a = a^i e_i = a^i f^\alpha\; e_i\cdot f_\alpha = a^i f_{\alpha i}f^\alpha. \tag{4.228}$$

It follows that the transformation law for the components is

$$a_\alpha = f_{\alpha i}a^i, \tag{4.229}$$

with similar expressions holding for the superscripted components.
These formulae extend simply to include linear functions. For example, we see 
that 


$$F_{\alpha\beta} = f_{\alpha i}f_{\beta j}F^{ij}. \tag{4.230}$$


Again, similar expressions hold for superscripts and for mixtures of indices. In 
particular we have 


$$F_\alpha{}^\beta = f_{\alpha i}f^{\beta j}F^i{}_j. \tag{4.231}$$


Expressed in terms of matrix multiplication, this would be an equivalence trans- 
formation. Of course, the abstract frame-free function F is unaffected by any 
change of basis. All that changes is the particular representation of the function 
in the chosen coordinate system. Any set of $n^2$ numbers with this transformation
property are called the components of a rank-2 tensor, the implication being that
the underlying function is frame-independent.
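
The transformation laws can be confirmed by computing the components in the second frame in two ways: directly from the frame-free objects, and from the components in the first frame. The sketch below does this for $a_\alpha = f_{\alpha i}a^i$ and $F_{\alpha\beta} = f_{\alpha i}f_{\beta j}F^{ij}$ in the Euclidean plane. It is an illustration under stated assumptions: both frames, the vector and the matrix `M` defining the linear function are hypothetical examples, and the helper names are our own.

```python
from fractions import Fraction as Fr

def dot(u, v):
    """Euclidean inner product of two component lists."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal(frame):
    """Reciprocal frame of two vectors in the plane, so that r[i] . frame[j] = delta_ij."""
    (a, b), (c, d) = frame
    det = a * d - b * c
    return [[d / det, -c / det], [-b / det, a / det]]

def apply(M, v):
    """The frame-free linear function F(v), represented here as M v on a standard basis."""
    return [dot(row, v) for row in M]

e = [[Fr(1), Fr(0)], [Fr(1), Fr(2)]]      # first frame {e_i}
f = [[Fr(2), Fr(1)], [Fr(0), Fr(1)]]      # second frame {f_alpha}
e_up = reciprocal(e)
M = [[Fr(1), Fr(2)], [Fr(3), Fr(4)]]
a = [Fr(3), Fr(-1)]

# transformation matrix f_{alpha i} = f_alpha . e_i
f_low = [[dot(f[al], e[i]) for i in range(2)] for al in range(2)]

# vector components: a_alpha computed directly and from a^i
a_up = [dot(a, e_up[i]) for i in range(2)]
a_direct = [dot(a, f[al]) for al in range(2)]
a_law = [sum(f_low[al][i] * a_up[i] for i in range(2)) for al in range(2)]

# tensor components: F_{alpha beta} computed directly and from F^{ij}
F_up = [[dot(e_up[i], apply(M, e_up[j])) for j in range(2)] for i in range(2)]
F_direct = [[dot(f[al], apply(M, f[be])) for be in range(2)] for al in range(2)]
F_law = [[sum(f_low[al][i] * f_low[be][j] * F_up[i][j]
              for i in range(2) for j in range(2))
          for be in range(2)] for al in range(2)]
```

In both cases the two routes agree exactly, which is the statement that the components transform as they must for a frame-independent object.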

In conventional accounts, the subject of tensors is often built up by taking 
the transformation law as fundamental. That is, a vector (rank-1 tensor) is 
defined as a set of components which transform according to equation (4.229) 
under a change of basis. Once one has the tools available to treat vectors and 
linear operations in a frame-free manner, such an approach becomes entirely 
unnecessary. The defining property of a tensor is that it represents a genuine 
geometric object (or operation) and does not depend on a choice of frame. Given 
this, the transformation laws (4.229) and (4.231) follow automatically. In this 
book the name tensor is applied to any frame-independent linear function, such 
as F. We will encounter a variety of such objects in later chapters. 


4.6 Notes 


The realisation that geometric algebra is a universal tool for physics was a key 
point in the modern development of the subject, and was first strongly promoted 
by David Hestenes (figure 4.5). Before his work, physicists’ sole interaction 
with geometric algebra was through the quantum theory of spin. The Pauli 
and Dirac matrices form representations of Clifford algebras, a fact that was 
realised as soon as they were introduced. But in the 50 years since Clifford’s 
original idea, the geometry behind his algebra had been lost as mathematicians 
concentrated on its algebraic properties. This discovery of the Pauli and Dirac 
matrices thus gave rise to two mistaken beliefs. The first was that there was 
something intrinsically quantum-mechanical in the non-commutative properties 
of the matrices. This is clearly not the case. Clifford died long before quantum 
theory was first formulated and was motivated entirely by classical geometry, 
and his algebra is today routinely employed in a range of subjects far removed 
from quantum theory. 
Figure 4.5 David Hestenes. Inventor of geometric calculus and first to
draw attention to the universal nature of geometric algebra. He wrote 
the influential Space-Time Algebra in 1966, and followed this with a fully 
developed formalism in Clifford Algebra to Geometric Calculus (Hestenes 
& Sobczyk, 1984). This was followed by the (simpler) New Foundations 
for Classical Mechanics, first published in 1986 (second edition 1999). In 
a series of papers Hestenes and coworkers showed how geometric algebra 
could be applied in the study of classical and quantum mechanics, electro- 
dynamics, projective and conformal geometry and Lie group theory. More 
recently, he has advocated the use of geometric algebra in the field of com- 
puter graphics. 


The second widespread belief was that matrices were crucial to understanding 
the properties of Clifford algebras. This too is erroneous. The geometric algebra 
of a finite-dimensional vector space is an associative algebra, so always has a ma- 
trix representation. But these matrices add little, if anything, to understanding 
the properties of the algebra. Furthermore, an insistence on working with ma- 
trices deters one from applying geometric algebra to anything beyond the lowest 
dimensional spaces, because the size of the matrices increases exponentially with 
the dimension of the space. Working directly with the elements of the algebra 
imposes no such constraints, and one can easily apply the ideas to spaces of any 
dimension, including infinite-dimensional spaces. 

Mathematicians had few such misconceptions, and Atiyah and others devel- 
oped Clifford algebra as a powerful tool for geometry. Even in these develop- 
ments, however, the emphasis was usually on Clifford algebra as an extra tool 
on top of the standard techniques for solving geometric problems. The algebra 
was seldom used as a complete language for geometry. The picture first started
to change when Hestenes recovered Clifford’s original interpretation of the Pauli
matrices. This led Hestenes to question whether the appearance of a Clifford
algebra was telling us something about the underlying structure of quantum
theory. Hestenes then went on to promote the universal nature of the algebra,
which he publicised in a series of books and papers. Acceptance of this view
is growing and, while not everyone is in full agreement, it is now hard to find
an area of physics to which geometric algebra cannot be, or has not been, applied
with some degree of success.


4.7 Exercises


4.1 Prove that the outer product of a set of linearly dependent vectors
    vanishes.

4.2 In a Euclidean space, Gram-Schmidt orthogonalisation proceeds by
    successively replacing each vector in a set $\{a_i\}$ by one perpendicular to the
    preceding vectors. Prove that such a vector is given by

    $$e_i = a_i - \sum_{j=1}^{i-1}\frac{a_i\cdot e_j}{e_j^2}\,e_j.$$

    Prove that we can also write this as

    $$e_i = a_i\wedge a_{i-1}\wedge\cdots\wedge a_1\,(a_{i-1}\wedge\cdots\wedge a_1)^{-1}.$$

4.3 Prove that

    $$(a\wedge b)\times(c\wedge d) = b\cdot c\; a\wedge d - a\cdot c\; b\wedge d + a\cdot d\; b\wedge c - b\cdot d\; a\wedge c.$$

4.4 The length of a vector in Euclidean space is defined by $|a| = \sqrt{a^2}$, and
    the angle $\theta$ between two vectors is defined by

    $$\cos(\theta) = a\cdot b/(|a||b|).$$

    Show that a linear transformation $\mathsf{F}$ which leaves lengths and angles
    unchanged must satisfy

    $$\bar{\mathsf{F}} = \mathsf{F}^{-1}.$$

    What does this imply for the determinant of $\mathsf{F}$? A reflection in the
    (hyper)plane perpendicular to $n$ is defined by

    $$\mathsf{R}(a) = -nan,$$

    where $n^2 = 1$. Show that $\bar{\mathsf{R}} = \mathsf{R}^{-1}$, and that $\mathsf{R}$ has determinant −1.

4.5 For the reflection in the preceding question introduce a suitable basis
    frame and express F in terms of a matrix $F_{ij}$. Verify the results for the
    determinant and inverse of this matrix. (Hint: align one of the basis
    vectors with $n$.)
4.6 A rotor $R$ is defined by

    $$R = \exp(-\lambda B/2).$$

    By Taylor expanding in $\lambda$, prove that the operation

    $$\mathsf{R}(A) = RAR^\dagger$$

    preserves the grade(s) of the multivector $A$.

4.7 Show that the plane $B$ is unchanged by the rotation defined by the rotor
    $R = \exp(B/2)$.

4.8 Analyse the properties of the matrix

    (; ae) .

    To what geometric operation does this matrix correspond? Can this
    matrix be diagonalised, and does it have a sensible singular value
    decomposition?

4.9 Suppose that the linear transformation $\mathsf{F}$ has a complex eigenvector $e+if$
    with associated eigenvalue $\alpha+i\beta$. What is the effect of $\mathsf{F}$ on the $e\wedge f$
    plane? How should one interpret the action of $\mathsf{F}$ in this plane?

4.10 Suppose that the vectors $\{e_k\}$ form an orthonormal basis frame for
    $n$-dimensional Euclidean space. What is the effect of the transformation

    $$T(a) = a + \lambda\, a\cdot e_1\, e_2$$

    on the rows of the matrix $F_{ij}$ formed by decomposing $\mathsf{F}$ in the $\{e_k\}$
    frame? Use this result to prove that the determinant of a matrix is
    unchanged by adding a multiple of one row to another.
5 Relativity and spacetime


The geometric algebra of spacetime is called the spacetime algebra. Historically, 
the spacetime algebra was the first modern implementation of geometric algebra 
to gain widespread attention amongst the physics community. This is because 
it provides a synthetic framework for studying spacetime physics. There are 
two main approaches to the study of geometry, which can be loosely referred to 
as the algebraic and synthetic traditions. In the algebraic approach one works 
entirely with the components of a vector and manipulates these directly. Such 
an approach leads naturally to the subject of tensors, and places considerable 
emphasis on how coordinates transform under changes of frame. The synthetic 
approach, on the other hand, treats vectors as single, abstract entities x or a, 
and manipulates these directly. Geometric algebra follows in this tradition. 

For much of modern physics the synthetic approach has come to dominate. 
The most obvious examples of this are classical mechanics and electromagnetism, 
both of which helped shape the development of abstract vector calculus. For 
these subjects, presentations typically perform all of the required calculations 
with the three-dimensional scalar and cross products. We have argued that geo- 
metric algebra provides extra efficiency and clarity, though it is not essential 
to a synthetic treatment of three-dimensional physics. But for spacetime cal- 
culations the cross product cannot be defined. Despite the obvious advantages 
of synthetic treatments, most relativity texts revert to a more basic, algebraic 
approach involving the components of 4-vectors and Lorentz-transform matri- 
ces. Such an approach has trouble encoding such basic notions as a plane in 
spacetime and, unsurprisingly, does a very poor job of handling the dynamics of 
extended bodies. 

To develop a generally applicable algebra of vectors in spacetime one has 
little option but to use either geometric algebra, or the language of exterior 
forms (which is essentially a subset of geometric algebra which only employs 
the interior and exterior products). This is why relativistic physics still tends
to dominate the literature of applications of geometric algebra. Many aspects
of special relativity become clearer when viewed in the language of geometric 
algebra and, crucially, a wealth of new computational tools is provided which 
dramatically simplify relativistic problems. 


5.1 An algebra for spacetime 


It is not our intention in this chapter to give a fully self-contained introduction to 
relativity. Such an account can be found in the various books listed at the end of 
this chapter. In brief, a series of famous experiments conducted in the latter half 
of the nineteenth century showed that light did not appear to behave in quite 
the expected, Newtonian manner. This led Einstein to his ‘second postulate’, 
that the speed of light c is the same for all inertial (non-accelerating) observers. 
Combined with Einstein’s ‘first postulate’, the principle of relativity, one is led 
inexorably to special relativity. The principle of relativity states simply that 
all inertial frames are equivalent for the purposes of physical experiment. An 
immediate consequence of these postulates is that the underlying geometry is no 
longer that of a (Euclidean) three-dimensional space, but instead the appropriate 
arena for physics is (Lorentzian) spacetime. 

To understand why this is the case, suppose that a spherical flash of light is 
sent out from a source, and this event is described in two coordinate frames. We 
discuss the concept of a frame, as distinct from a single observer, later in this 
chapter. The frames are in relative motion, and their origins coincide with the 
location of the source at the moment the light is emitted. At this instant both 
frames also set their time measurements to zero. In the first frame the source is 
at rest and the light expands radially according to the equation 


$$r = ct. \tag{5.1}$$


But the second frame must also record a radially expanding shell of light since 
the relative velocity of the source has no effect on the speed of light. The second 
frame therefore sees light expanding according to the equation 


$$r' = ct'. \tag{5.2}$$


Since the two frames are in relative motion, points at a given fixed r cannot 
coincide with those at a fixed r’. So points reached at the same time in one 
frame are reached at different times in the second frame. But in both frames 
the light lies on a spherical expanding shell. So the one thing that is common to 
both frames is the value of 


$$(ct)^2 - r^2 = (ct')^2 - (r')^2 = 0. \tag{5.3}$$


This defines the invariant interval of special relativity and is the fundamental 
algebraic concept we need to encode. 
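
The argument can be illustrated numerically. The sketch below assumes the standard Lorentz boost along one axis, which has not yet been introduced at this point in the text, and the function names (`boost`, `interval`) are hypothetical. Two events on the same shell of light, reached simultaneously in the first frame, are mapped to the second frame.

```python
import math

def boost(t, x, u, c=1.0):
    """Coordinates (t', x') in a frame moving with speed u along x (standard boost, assumed)."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return gamma * (t - u * x / c ** 2), gamma * (x - u * t)

def interval(t, x, c=1.0):
    """The combination (ct)^2 - x^2 of equation (5.3), for motion along one axis."""
    return (c * t) ** 2 - x ** 2

# two points on the light shell at t = 2 in the first frame (units with c = 1)
right = (2.0, 2.0)
left = (2.0, -2.0)
right_new = boost(*right, u=0.6)
left_new = boost(*left, u=0.6)
```

In the second frame the two events occur at different times, but `interval` still vanishes for both, so each frame sees the light on an expanding spherical shell.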
The preceding argument shows us that the algebra we need to construct is
generated by four orthogonal vectors $\{\gamma_0,\gamma_1,\gamma_2,\gamma_3\}$ satisfying the algebraic
relations

$$\gamma_0^2 = 1, \qquad \gamma_0\cdot\gamma_i = 0, \qquad \gamma_i\cdot\gamma_j = -\delta_{ij}, \tag{5.4}$$

where $i$ and $j$ run from 1 to 3. These are summarised in relativistic notation as

$$\gamma_\mu\cdot\gamma_\nu = \eta_{\mu\nu} = \mathrm{diag}(+\,-\,-\,-), \qquad \mu,\nu = 0,\ldots,3. \tag{5.5}$$


The notation $\{\gamma_\mu\}$ for a spacetime frame is a widely adopted convention in
the spacetime algebra literature. The notation is borrowed from Dirac theory
and we continue to employ it in this book. We have also chosen the ‘particle
physics’ choice of signature, which has spacelike vectors with negative norm.
General relativists often work with the opposite signature and swap all of the
signs in $\eta_{\mu\nu}$. Both choices have their advocates and all (known) physical laws
are independent of the choice of signature. Throughout we use Latin indices to
denote the range 1-3 and Greek for the full spacetime range 0-3.

The $\{\gamma_\mu\}$ vectors are dimensionless, as is clear from their squares. Since we
are in a space of mixed signature, we must adopt the conventions of section 4.3
and distinguish between a frame and its reciprocal. For the $\{\gamma_\mu\}$ frame the
reciprocal frame vectors, $\{\gamma^\mu\}$, have $\gamma^0 = \gamma_0$ and $\gamma^i = -\gamma_i$. A general vector in
the spacetime algebra can be constructed from the $\{\gamma_\mu\}$ vectors. A spacetime
event, for example, is encoded in the vector $x$, which has coordinates $x^\mu$ in the
$\{\gamma_\mu\}$ frame. Explicitly, the vector $x$ is

$$x = x^\mu\gamma_\mu = ct\,\gamma_0 + x^i\gamma_i, \tag{5.6}$$


which has dimensions of distance. From this point on it will be convenient to
work in units where the speed of light $c$ is 1. Factors of $c$ can then be inserted in
any final result if the answer is required in different units. The mixed signature
means that the square of a vector ($a$, say) is no longer necessarily positive, and
instead we have

$$a^2 = aa = \epsilon\,|a^2|. \tag{5.7}$$

Here $\epsilon$ is the signature of the vector and can be ±1 or 0. The mixed signature does
not affect the validity of the axiomatic development and results of chapter 4,
which made no reference to the signature.
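
In components the square of a vector is $a^2 = \eta_{\mu\nu}a^\mu a^\nu$, so the signature $\epsilon$ can be read off directly. The short sketch below is an illustration with hypothetical names (`ETA`, `vector_square`, `signature`, `lower_index`), working in units with $c = 1$ and with components ordered as $(a^0, a^1, a^2, a^3)$.

```python
ETA = (1, -1, -1, -1)   # diagonal of eta_{mu nu} in the 'particle physics' signature

def vector_square(a):
    """a^2 = eta_{mu nu} a^mu a^nu for components a^mu in the {gamma_mu} frame."""
    return sum(s * x * x for s, x in zip(ETA, a))

def signature(a):
    """epsilon in (5.7): +1, 0 or -1 according to the sign of a^2."""
    sq = vector_square(a)
    return (sq > 0) - (sq < 0)

def lower_index(a):
    """Components a_mu = a . gamma_mu = eta_{mu nu} a^nu."""
    return [s * x for s, x in zip(ETA, a)]

timelike = (2, 1, 0, 0)
null = (1, 1, 0, 0)
spacelike = (0, 1, 0, 0)
```

For these three vectors `signature` returns 1, 0 and −1 respectively. Lowering the index flips the sign of the spatial components only, in line with $\gamma^0 = \gamma_0$ and $\gamma^i = -\gamma_i$.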


5.1.1 The bivector algebra 


There are $4\times 3/2 = 6$ bivectors in our algebra. These fall into two classes:
those that contain a timelike component (e.g. $\gamma_i\wedge\gamma_0$), and those that do not
(e.g. $\gamma_i\wedge\gamma_j$). For any pair of orthogonal vectors $a$ and $b$, $a\cdot b = 0$, we have

$$(a\wedge b)^2 = abab = -abba = -a^2b^2. \tag{5.8}$$

Figure 5.1 A spacetime diagram (axes: time along $\gamma_0$, space along $\gamma_1$).
Spacetime diagrams traditionally have the $t$ axis vertical, so a suitable
bivector for this plane is $\gamma_1\gamma_0$.


The two types of bivectors therefore have different signs of their squares. First, 
we have 


$$(\gamma_i\wedge\gamma_j)^2 = -\gamma_i^2\gamma_j^2 = -1, \tag{5.9}$$


which is the familiar result for Euclidean bivectors. Each of these generates 
rotations in a plane. For bivectors containing a timelike component, however, 
we have 


$$(\gamma_i\wedge\gamma_0)^2 = -\gamma_i^2\gamma_0^2 = +1. \tag{5.10}$$


Bivectors with positive square have a number of new properties. One immediate 
result we notice, for example, is that 
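$$\begin{aligned} e^{\alpha\gamma_1\gamma_0} &= 1 + \alpha\gamma_1\gamma_0 + \frac{\alpha^2}{2!} + \frac{\alpha^3}{3!}\gamma_1\gamma_0 + \cdots \\ &= \cosh(\alpha) + \sinh(\alpha)\gamma_1\gamma_0. \end{aligned} \tag{5.11}$$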


This shows us that we are dealing with hyperbolic geometry. This will prove 
crucial to our treatment of Lorentz transformations. Traditionally, spacetime 
diagrams are drawn with the time axis vertical (see figure 5.1). For these
diagrams the ‘right-handed’ bivector is, for example, $\gamma_1\gamma_0$. These bivectors do not
generate 90° rotations, however, as we now have

$$\gamma_0\cdot(\gamma_1\gamma_0) = -\gamma_1, \qquad \gamma_1\cdot(\gamma_1\gamma_0) = -\gamma_0. \tag{5.12}$$
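
The series for the exponential of a bivector can be summed term by term, keeping track only of the scalar part and the coefficient of the bivector. The sketch below is an illustration with a hypothetical function name (`exp_bivector`): it takes the square of the unit bivector as an input, so the same code covers the timelike case $B^2 = +1$ of equation (5.11) and the Euclidean case $B^2 = -1$.

```python
import math

def exp_bivector(alpha, square, terms=40):
    """Sum the power series of exp(alpha B) for a unit bivector with B*B = square.

    Returns the pair (scalar part, coefficient of B).
    """
    scalar, biv = 0.0, 0.0
    term_s, term_b = 1.0, 0.0          # the current term (alpha B)^n / n!, starting at n = 0
    for n in range(terms):
        scalar += term_s
        biv += term_b
        # next term: (s + b B)(alpha B)/(n + 1) = alpha b B^2/(n + 1) + alpha s B/(n + 1)
        term_s, term_b = alpha * term_b * square / (n + 1), alpha * term_s / (n + 1)
    return scalar, biv

hyperbolic = exp_bivector(0.7, +1)     # bivector with a timelike component
circular = exp_bivector(0.7, -1)       # purely spatial bivector
```

The first result agrees with $(\cosh\alpha, \sinh\alpha)$ and the second with $(\cos\alpha, \sin\alpha)$, which is the numerical content of the statement that bivectors with positive square lead to hyperbolic geometry.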


5.1.2 The pseudoscalar 
We define the (grade-4) pseudoscalar I by 


$$I = \gamma_0\gamma_1\gamma_2\gamma_3. \tag{5.13}$$


In the literature the symbol $i$ is often used for the pseudoscalar. We have
departed from this practice to avoid confusion with the $i$ of quantum theory.
Using the latter symbol presents a potential problem because of the fact that the
pseudoscalar anticommutes with vectors. The pseudoscalar defines an orientation
for spacetime, and the reason for the above choice will emerge shortly. We
still assume that $\{\gamma_1,\gamma_2,\gamma_3\}$ form a right-handed orthonormal set, as usual for
a three-dimensional Cartesian frame. Since $I$ is grade-4, it is equal to its own
reverse:

$$\tilde{I} = \gamma_3\gamma_2\gamma_1\gamma_0 = I. \tag{5.14}$$

For relativistic applications we use the tilde ~ to denote the reverse operation.
The problem with the alternative symbol, the dagger †, is that it is usually
reserved for a different role in relativistic quantum theory. The fact that $\tilde{I} = I$
makes it easy to compute the square of $I$:

$$I^2 = I\tilde{I} = (\gamma_0\gamma_1\gamma_2\gamma_3)(\gamma_3\gamma_2\gamma_1\gamma_0) = -1. \tag{5.15}$$


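Multiplication of a bivector by $I$ results in a multivector of grade $4-2=2$, so
returns another bivector. This provides a map between bivectors with positive
and negative squares, for example

$$I\gamma_1\gamma_0 = \gamma_1\gamma_0 I = \gamma_1\gamma_0\gamma_0\gamma_1\gamma_2\gamma_3 = -\gamma_2\gamma_3. \tag{5.16}$$

If we define $B_i = \gamma_i\gamma_0$ then the bivector algebra can be summarised by

$$\begin{aligned} B_i\times B_j &= \epsilon_{ijk}\,IB_k, \\ (IB_i)\times(IB_j) &= -\epsilon_{ijk}\,IB_k, \\ (IB_i)\times B_j &= -\epsilon_{ijk}\,B_k. \end{aligned} \tag{5.17}$$
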
These equations show that the pseudoscalar provides a natural complex structure 
for the set of bivectors. This in turn tells us that there is a complex structure 
hidden in the group of Lorentz transformations. 
As well as the four vectors, we also have four trivectors in our algebra. The 
vectors and trivectors are interchanged by a duality transformation, 


$$\gamma_1\gamma_2\gamma_3 = \gamma_0\gamma_0\gamma_1\gamma_2\gamma_3 = \gamma_0 I = -I\gamma_0. \tag{5.18}$$


The pseudoscalar I anticommutes with vectors and trivectors, as we are in a 
space of even dimensions. As always, I commutes with all even-grade multivec- 
tors. 


5.1.3 The spacetime algebra 


Combining the preceding results, we arrive at an algebra with 16 terms. The 
$\{\gamma_\mu\}$ define an explicit basis for this algebra as follows:

    1          {γ_μ}       {γ_μ ∧ γ_ν}   {Iγ_μ}         I
    1 scalar   4 vectors   6 bivectors   4 trivectors   1 pseudoscalar

This is the spacetime algebra, $\mathcal{G}(1,3)$. The structure of this algebra tells us
practically all one needs to know about (flat) spacetime and the Lorentz
transformation group. A general element of the spacetime algebra can be written
as

$$M = \alpha + a + B + Ib + I\beta, \tag{5.19}$$


where $\alpha$ and $\beta$ are scalars, $a$ and $b$ are vectors and $B$ is a bivector. The reverse
of this element is

$$\tilde{M} = \alpha + a - B - Ib + I\beta. \tag{5.20}$$

The vector generators of the spacetime algebra satisfy

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\eta_{\mu\nu}. \tag{5.21}$$

These are the defining relations of the Dirac matrix algebra, except for the
absence of an identity matrix on the right-hand side. It follows that the Dirac
matrices define a representation of the spacetime algebra. This also explains our
notation of writing $\{\gamma_\mu\}$ for an orthonormal frame. But it must be remembered
that the $\{\gamma_\mu\}$ are basis vectors, not a set of matrices in ‘isospace’.
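
That no matrices are needed can be made concrete with a small implementation of the geometric product acting directly on basis blades. The sketch below is our own illustration, not the authors' code: each basis blade is stored as a bitmask (bit $\mu$ set means that $\gamma_\mu$ is present), a multivector is a dictionary from bitmask to coefficient, and all of the names are hypothetical. The sign of a product is found by counting the swaps needed to order the vectors, and repeated vectors are contracted with the metric.

```python
METRIC = (1, -1, -1, -1)            # squares of gamma_0, gamma_1, gamma_2, gamma_3

def blade_product(a, b):
    """Product of two basis blades given as bitmasks; returns (sign, bitmask)."""
    sign = 1
    shifted = a >> 1
    while shifted:                  # parity of the swaps needed to sort the factors
        if bin(shifted & b).count("1") % 2:
            sign = -sign
        shifted >>= 1
    for mu in range(4):             # each repeated vector contributes its square
        if ((a & b) >> mu) & 1:
            sign *= METRIC[mu]
    return sign, a ^ b

def gp(x, y):
    """Geometric product of two multivectors stored as {bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            s, blade = blade_product(a, b)
            out[blade] = out.get(blade, 0) + s * ca * cb
    return {k: v for k, v in out.items() if v != 0}

def add(x, y):
    """Sum of two multivectors, dropping terms that cancel."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

gamma = [{1 << mu: 1} for mu in range(4)]
I = gp(gp(gamma[0], gamma[1]), gp(gamma[2], gamma[3]))     # pseudoscalar (5.13)
B10 = gp(gamma[1], gamma[0])                               # timelike bivector gamma_1 gamma_0
B12 = gp(gamma[1], gamma[2])                               # spatial bivector gamma_1 gamma_2

# number of basis blades of each grade: 1, 4, 6, 4, 1
grade_counts = [sum(1 for m in range(16) if bin(m).count("1") == g) for g in range(5)]
```

With these definitions `gp(I, I)` gives the scalar −1, `gp(B10, B10)` gives +1 and `gp(B12, B12)` gives −1, while `add(gp(gamma[mu], gamma[nu]), gp(gamma[nu], gamma[mu]))` reproduces $2\eta_{\mu\nu}$ for every pair of indices.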


5.2 Observers, trajectories and frames 


From a study of the literature on relativity one can easily form the impression 
that the subject is in the main concerned with transformations between frames. 
But it is the subject of relativistic dynamics that is of primary importance to 
us, and one aim of the spacetime algebra development is to minimise the use of 
coordinate frames. Instead, we aim to develop spacetime physics in a frame-free 
manner and, where necessary, then focus on the physics as seen from different 
observers. Developing relativistic physics in this manner has the added advan- 
tage of clarifying precisely which aspects of special relativity need modification 
to incorporate gravity. 


5.2.1 Spacetime paths 


Suppose that $x(\lambda)$ describes a curve in spacetime, where $\lambda$ is some arbitrary,
monotonically-increasing parameter along the curve. The tangent vector to the
curve is

$$x' = \frac{dx(\lambda)}{d\lambda}. \tag{5.22}$$

Under a change of parameter from $\lambda$ to $\tau$ the tangent vector becomes

$$\frac{dx}{d\tau} = \frac{d\lambda}{d\tau}\frac{dx}{d\lambda}. \tag{5.23}$$

It follows that

$$\left(\frac{dx}{d\tau}\right)^2 = \left(\frac{d\lambda}{d\tau}\right)^2\left(\frac{dx}{d\lambda}\right)^2, \tag{5.24}$$

so the sign of $(x')^2$ is an invariant feature of the path. We assume for simplicity
that this sign does not change along the path. As we are working in a space of
mixed signature there are then three cases to consider.

The first possibility is that $(x')^2 > 0$, in which case the path is said to be
timelike. Timelike trajectories are those followed by massive particles. For these
paths we can define an invariant proper interval

$$\Delta\tau = \int_{\lambda_1}^{\lambda_2}\left(\frac{dx}{d\lambda}\cdot\frac{dx}{d\lambda}\right)^{1/2}d\lambda. \tag{5.25}$$


It is straightforward to check that this interval is independent of how the path
is parameterised. If we consider the simplest case of a particle (or observer) at
rest in the $\gamma_0$ system, its spacetime trajectory can be written as
$x = t\gamma_0$. In this case it is clear that the interval defines the elapsed
time in the observer’s rest frame. This must be true for all possible paths, so
the interval (5.25) defines the time as measured along the path. This is called
the proper time, and is usually given the symbol $\tau$. The proper time defines
a preferred parameter along the curve with the unique property that the
velocity $v$,

\[ v = \frac{dx}{d\tau} = \dot{x}, \tag{5.26} \]

satisfies

\[ v^2 = 1. \tag{5.27} \]


Throughout we use dots to denote differentiation with respect to proper time
$\tau$. The unit timelike vector $v$ then defines the instantaneous rest frame.
The definition of ‘proper time’ makes it clear that in relativity observers in
relative motion measure different times.
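The parameter independence of the proper interval (5.25) can be checked numerically. The following is a minimal sketch rather than anything taken from the text: the helper names `proper_interval` and `straight_line`, the particle speed of 0.6 and the cubic reparameterisation are all assumptions chosen for illustration.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # metric with signature (+, -, -, -)


def proper_interval(path, params):
    """Approximate the integral of sqrt(x' . x') over the sampled parameter."""
    x = path(params)                       # sampled events, shape (N, 4)
    dx = np.gradient(x, params, axis=0)    # tangent vectors dx/d(parameter)
    integrand = np.sqrt(np.einsum("ni,ij,nj->n", dx, ETA, dx))
    # Trapezoidal rule written out by hand.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(params)))


def straight_line(lam):
    """Hypothetical path: speed 0.6 along the 3 axis, x = (lam, 0, 0, 0.6 lam)."""
    zero = np.zeros_like(lam)
    return np.stack([lam, zero, zero, 0.6 * lam], axis=1)


# Parameterise by coordinate time, lam from 0 to 10.
lam = np.linspace(0.0, 10.0, 2001)
tau_linear = proper_interval(straight_line, lam)

# Same curve, reparameterised by lam = s**3 + s (monotonic, s from 0 to 2).
s = np.linspace(0.0, 2.0, 20001)
tau_cubic = proper_interval(lambda p: straight_line(p**3 + p), s)

print(tau_linear, tau_cubic)  # both close to 10 * sqrt(1 - 0.36) = 8
```

Both parameterisations return the same elapsed proper time, up to discretisation error, even though the tangent vectors differ at every point.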

The second case to consider is that $(x')^2 = 0$. In this case the trajectory is
said to be lightlike or null. Null trajectories are followed by massless (point)
particles and (in the geometric optics limit) they define possible photon paths.
There is no preferred parameter along these curves, and the proper distance (or
time) measured along the curve is 0. Photons do still carry an intrinsic clock,
defined by their frequency, but this can tick at an arbitrary rate.

The third possibility is that $(x')^2 < 0$, in which case the trajectory is said
to be spacelike. As with timelike paths there is a preferred (affine) parameter
along the path such that $(x')^2 = -1$. In this case the parameter defines the
proper distance. Spacelike curves cannot arise for the trajectories of (known)
particles, which are constrained to move at less than (or equal to) the speed
of light. Events which are separated by spacelike intervals cannot be in causal
contact with each other and cannot exert any classical influence over each other.
The three possibilities for spacetime trajectories are summarised in figure 5.2.

Figure 5.2 Spacetime trajectories. There are three different types of spacetime
trajectory: timelike, lightlike and spacelike. The set of lightlike trajectories
through a point separates spacetime into three regions: the past, the future and
‘elsewhere’.


5.2.2 Spacetime frames 


The subject of spacetime frames and coordinates dominates many discussions of
the meaning of special relativity. The concept of a frame is distinct from that
of an observer as it involves the notion of a coordinate lattice. We start with an
inertial observer with constant velocity $v$. This velocity vector is then equated
with the timelike vector $e_0$ from a spacetime frame $\{e_\mu\}$. The remaining
vectors $e_i$ are chosen so that they form a right-handed set of orthonormal
spacelike vectors perpendicular to $e_0 = v$. The $\{e_\mu\}$ then define a set of
frame vectors satisfying

\[ e_\mu\cdot e_\nu = \eta_{\mu\nu}. \tag{5.28} \]

So far these vectors are only defined at a single point on the observer’s trajectory.
We now assume that the vectors extend throughout all spacetime, so that any
event can be given a set of spacetime coordinates

\[ x^\mu = e^\mu\cdot x. \tag{5.29} \]

Clearly these coordinates are a rather distinct concept from what an observer will
actually measure, since the observer is constrained to remain in one place and
only receives incoming photons. Frequently one sees discussions involving arrays
of clocks all cleverly synchronised to read the time $x^0$ at each spatial location.
But how such a frame is set up is not really the point. The assertion is that the
coordinates as specified above are a reasonable model for the sort of distance and
time measurements performed in a laboratory system using physical measuring
devices. It is precisely this assertion that is challenged by general relativity,
which insists that one talk entirely in terms of physically-defined coordinates, so
that the $x^\mu$ defined above have no physical meaning. That said, for applications
not involving gravity and for non-accelerating frames, we can safely identify the
coordinates defined above with physical distances and times and will continue
to do so in this chapter.


5.2.3 Relative vectors


Now suppose that we follow a timelike path with instantaneous velocity $v$,
$v^2 = 1$. What sort of quantities do we measure? First we construct a frame of
rest vectors $\{e_i\}$ perpendicular to $v = e_0$. We also take a point on the
worldline as the spatial origin. Then a general event $x$ can be decomposed in
this frame as

\[ x = t e_0 + x^i e_i, \tag{5.30} \]

where the time coordinate is

\[ t = x\cdot e_0 = x\cdot v \tag{5.31} \]

and spatial coordinates are

\[ x^i = x\cdot e^i. \tag{5.32} \]

Suppose now that the event is a point on the worldline of an object at rest in
our frame. The three-dimensional vector to this object is

\[ x^i e_i = x\cdot e^\mu\, e_\mu - x\cdot e^0\, e_0 = x - x\cdot v\, v = x\wedge v\, v. \tag{5.33} \]

Wedging with $v$ projects onto the components of the vector $x$ in the rest frame
of $v$. The key quantity is the spacetime bivector $x\wedge v$. We call this the
relative vector and write

\[ \mathbf{x} = x\wedge v. \tag{5.34} \]

With these definitions we have

\[ xv = x\cdot v + x\wedge v = t + \mathbf{x}. \tag{5.35} \]

The invariant distance now decomposes as

\[ x^2 = xvvx = (x\cdot v + x\wedge v)(x\cdot v + v\wedge x) = (t + \mathbf{x})(t - \mathbf{x}) = t^2 - \mathbf{x}^2, \tag{5.36} \]

recovering the invariant interval. A second observer with a different velocity
performs a different split of $x$ into time and space components. But the interval
$x^2$ is the same for all observers as it manifestly does not depend on the choice
of frame.
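The observer dependence of the split, and the observer independence of the interval, can be illustrated numerically. The sketch below represents the spacetime algebra by Dirac matrices, which is an assumption made purely for convenience (any faithful matrix representation would do). The helper names `vector`, `scalar` and `split`, and the sample event and rapidity, are hypothetical.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra:
# g[0] squares to +1, g[1..3] square to -1, and distinct g's anticommute.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]


def vector(coeffs):
    """Spacetime vector x^mu gamma_mu from four real components."""
    return sum(c * gm for c, gm in zip(coeffs, g))


def scalar(m):
    """Scalar part of a multivector in this representation."""
    return np.trace(m).real / 4.0


def split(x, v):
    """Return (t, rel) with x v = t + rel, so t = x.v and rel = x ^ v."""
    xv = x @ v
    t = scalar(xv)
    return t, xv - t * ONE


x = vector([3.0, 1.0, -2.0, 0.5])          # sample event
x_squared = scalar(x @ x)                  # invariant interval x^2

t0, rel0 = split(x, g[0])                  # observer at rest in the gamma_0 frame
a = 0.8                                    # rapidity of a second observer
v = np.cosh(a) * g[0] + np.sinh(a) * g[1]
t1, rel1 = split(x, v)

interval0 = t0**2 - scalar(rel0 @ rel0)    # t^2 - x^2 for observer 0
interval1 = t1**2 - scalar(rel1 @ rel1)    # t^2 - x^2 for observer 1
print(t0, t1, interval0, interval1, x_squared)
```

The two observers assign different times `t0` and `t1` to the event, but both recover the same value of $t^2 - \mathbf{x}^2$, as in equation (5.36).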


5.2.4 The even subalgebra 


Each observer sees a set of relative vectors, which we model as spacetime
bivectors. What algebraic properties do these have? To simplify matters, we take
the timelike velocity vector to be $\gamma_0$ and introduce a standard frame of
relative vectors

\[ \boldsymbol{\sigma}_i = \gamma_i\gamma_0. \tag{5.37} \]

These define a set of spacetime bivectors representing timelike planes. (The
notation is again borrowed from quantum mechanics and is commonplace in the
spacetime algebra literature.) The $\{\boldsymbol{\sigma}_i\}$ satisfy

\[ \boldsymbol{\sigma}_i\cdot\boldsymbol{\sigma}_j = \tfrac{1}{2}(\gamma_i\gamma_0\gamma_j\gamma_0 + \gamma_j\gamma_0\gamma_i\gamma_0) = \tfrac{1}{2}(-\gamma_i\gamma_j - \gamma_j\gamma_i) = \delta_{ij}. \tag{5.38} \]

These act as vector generators for a three-dimensional algebra. This is the
geometric algebra of the relative space in the rest frame defined by $\gamma_0$.
Furthermore, the volume element of this algebra is

\[ \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = (\gamma_1\gamma_0)(\gamma_2\gamma_0)(\gamma_3\gamma_0) = -\gamma_1\gamma_0\gamma_2\gamma_3 = I, \tag{5.39} \]

so the algebra of relative space shares the same pseudoscalar as spacetime. This
was the reason for our earlier definition of $I$. Of course, we still have

\[ \tfrac{1}{2}(\boldsymbol{\sigma}_i\boldsymbol{\sigma}_j - \boldsymbol{\sigma}_j\boldsymbol{\sigma}_i) = \epsilon_{ijk} I \boldsymbol{\sigma}_k, \tag{5.40} \]

so that both relative vectors and relative bivectors are spacetime bivectors.

The even-grade terms in the spacetime algebra define the even subalgebra. As
we have just established, this algebra has precisely the properties of the algebra
of three-dimensional (relative) space. The even subalgebra contains scalar and
pseudoscalar terms, and six bivector terms. These are split into three timelike
bivectors and three spacelike bivectors, which in turn become relative vectors
and relative bivectors. This is called a spacetime split, and it is
observer-dependent. Different velocity vectors generate different spacetime
splits. Algebraically, this provides us with an extremely efficient tool for
comparing physical effects in different frames.

Spacetime bivectors which are also used as relative vectors are written in
bold. This conforms with our earlier usage of a bold face for vectors in three
dimensions. There is a potential ambiguity here: how are we to interpret the
expression $\mathbf{a}\wedge\mathbf{b}$? Our convention is that if all of the
terms in an expression are bold, the dot and wedge symbols drop down to their
three-dimensional meaning, otherwise they take their spacetime definition. This
works pretty well in practice, though where necessary we will try to draw
attention to the fact that this convention is in use.
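The algebraic relations (5.38) to (5.40) for the relative vectors can be verified in any faithful matrix representation. The following sketch uses Dirac matrices as an assumed representation (the text itself works without matrices); the variable names are hypothetical.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]

# Relative vectors sigma_i = gamma_i gamma_0 and the spacetime pseudoscalar I.
sig = [g[i] @ g[0] for i in (1, 2, 3)]
I = g[0] @ g[1] @ g[2] @ g[3]

# (5.38): the symmetrised product gives the Euclidean metric delta_ij.
metric_ok = all(
    np.allclose(0.5 * (sig[i] @ sig[j] + sig[j] @ sig[i]), (i == j) * ONE)
    for i in range(3)
    for j in range(3)
)

# (5.39): the relative volume element equals the spacetime pseudoscalar.
volume_ok = np.allclose(sig[0] @ sig[1] @ sig[2], I)

# (5.40) for (i, j, k) = (1, 2, 3): the antisymmetrised product is I sigma_3.
commutator_ok = np.allclose(0.5 * (sig[0] @ sig[1] - sig[1] @ sig[0]), I @ sig[2])

# The pseudoscalar squares to -1.
pseudoscalar_ok = np.allclose(I @ I, -ONE)

print(metric_ok, volume_ok, commutator_ok, pseudoscalar_ok)
```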


5.2.5 Relative velocity 


Suppose that an observer with constant velocity $v$ measures the relative velocity
of a particle with proper velocity $u(\tau) = \dot{x}(\tau)$, $u^2 = 1$. We have

\[ uv = \frac{d}{d\tau}\bigl(x(\tau)v\bigr) = \frac{d}{d\tau}(t + \mathbf{x}), \tag{5.41} \]

where $t + \mathbf{x}$ is the description of the event $x$ in the $v$ frame. It
follows that

\[ \frac{dt}{d\tau} = u\cdot v, \qquad \frac{d\mathbf{x}}{d\tau} = u\wedge v. \tag{5.42} \]

The relative velocity $\mathbf{u}$ as measured in the $v$ frame is therefore

\[ \mathbf{u} = \frac{d\mathbf{x}}{dt} = \frac{d\mathbf{x}}{d\tau}\frac{d\tau}{dt} = \frac{u\wedge v}{u\cdot v}. \tag{5.43} \]

This construction of the relative velocity is extremely elegant. It embodies the
concept of relativity in its precise (anti)symmetry. If we interchange $u$ and $v$
the second observer measures precisely the same relative speed as the first, but
in the opposite direction. Expressions like $u\wedge v/u\cdot v$ arise frequently
in the subject of projective geometry (see section 10.1). The resulting bivector
is homogeneous, which is to say we can rescale $u$ and $v$ and still recover the
same result. So the choice of parameterisation of the two spacetime trajectories
is irrelevant to their relative velocity. The relative velocity is determined
solely by the spacetime trajectories themselves, and not by any evolution
parameter.

The definition of the relative velocity ensures that the magnitude is

\[ \mathbf{u}^2 = \frac{(u\wedge v)^2}{(u\cdot v)^2} = 1 - \frac{1}{(u\cdot v)^2} < 1, \tag{5.44} \]

so no two observers measure a relative velocity greater than the speed of light
(which is 1 in our current choice of units). If we form the Lorentz factor
$\gamma$ using

\[ \gamma^{-2} = 1 - \mathbf{u}^2 = 1 + (u\cdot v)^{-2}\bigl[(uv - u\cdot v)(vu - v\cdot u)\bigr] = (u\cdot v)^{-2}, \tag{5.45} \]

we find that $\gamma = u\cdot v$. It follows that we can decompose the velocity as

\[ u = uvv = (u\cdot v + u\wedge v)v = \gamma(1 + \mathbf{u})v, \tag{5.46} \]

which shows a neat split into a part $\gamma\mathbf{u}v$ in the rest space of $v$,
and a part $\gamma v$ along $v$.
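Equations (5.43) to (5.46) can be exercised on a concrete pair of velocities. This is an illustrative sketch only: the Dirac-matrix representation, the helper `scalar`, and the particular rapidities and directions chosen for `u` and `v` are all assumptions.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]


def scalar(m):
    """Scalar part of a multivector in this representation."""
    return np.trace(m).real / 4.0


# Two unit timelike velocities (hypothetical values).
u = np.cosh(1.2) * g[0] + np.sinh(1.2) * (0.6 * g[1] + 0.8 * g[2])
v = np.cosh(0.4) * g[0] + np.sinh(0.4) * g[3]

uv = u @ v
lorentz_factor = scalar(uv)                           # gamma = u.v
rel_u = (uv - lorentz_factor * ONE) / lorentz_factor  # u ^ v / u.v, (5.43)
speed_sq = scalar(rel_u @ rel_u)                      # relative speed squared

# Interchanging the observers reverses the relative velocity.
rel_v = (v @ u - lorentz_factor * ONE) / lorentz_factor

# Reassemble u from its split, equation (5.46).
u_rebuilt = lorentz_factor * (ONE + rel_u) @ v

print(lorentz_factor, speed_sq)
```

The relative speed stays below 1 and satisfies $\mathbf{u}^2 = 1 - \gamma^{-2}$, while `rel_v` is the negative of `rel_u`, reflecting the antisymmetry noted above.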


5.2.6 Momentum and wave vectors 


The relativistic definitions of energy and momentum can be motivated in various
ways. Perhaps the simplest is to consider photons with frequency $\omega$ and
wavevector $\mathbf{k}$ measured in the $\gamma_0$ frame. From quantum theory, the
energy and momentum are given by $\hbar\omega$ and $\hbar\mathbf{k}$ respectively.
If we define the wavevector $k$ by

\[ k = \omega\gamma_0 + k^i\gamma_i, \tag{5.47} \]

then the energy-momentum vector for the photon is simply

\[ p = \hbar k. \tag{5.48} \]

An observer with velocity $v$, as opposed to $\gamma_0$, measures energy and
momentum given by

\[ E = p\cdot v, \qquad \mathbf{p} = p\wedge v. \tag{5.49} \]

We take this as the correct definition for massive particles as well. So a particle
of rest mass $m$ and velocity $u$ has an energy-momentum vector $p = mu$. A
spacetime split of this vector with the velocity vector $v$ yields

\[ pv = p\cdot v + p\wedge v = E + \mathbf{p}. \tag{5.50} \]

A significant feature of this definition is that the relative momentum is related
to the velocity by

\[ \mathbf{p} = m\,u\wedge v = \gamma m\mathbf{u}, \tag{5.51} \]

where again $\gamma$ is the Lorentz factor. One sometimes sees this formula written
in terms of a velocity-dependent mass $m' = \gamma m$, but we will not adopt this
practice here.

From the definition of $p$ we recover the invariant

\[ m^2 = p^2 = pvvp = (E + \mathbf{p})(E - \mathbf{p}) = E^2 - \mathbf{p}^2. \tag{5.52} \]

Similarly, for a photon with wavevector $k$, $k^2 = 0$, we have

\[ 0 = kvvk = (\omega + \mathbf{k})(\omega - \mathbf{k}) = \omega^2 - \mathbf{k}^2. \tag{5.53} \]

This recovers the relation $|\mathbf{k}| = \omega$, which holds in all frames.
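As a worked illustration of the split (5.50), consider a particle of rest mass $m$ moving along the 1 axis as seen by the $\gamma_0$ observer. The rapidity $\alpha$ and the choice of direction are assumptions made only for this example.

```latex
\begin{align*}
u &= \cosh(\alpha)\,\gamma_0 + \sinh(\alpha)\,\gamma_1, \qquad p = mu, \\
p\gamma_0 &= m\cosh(\alpha) + m\sinh(\alpha)\,\gamma_1\gamma_0 = E + \mathbf{p}, \\
E &= m\cosh(\alpha) = \gamma m, \qquad
\mathbf{p} = m\sinh(\alpha)\,\boldsymbol{\sigma}_1
           = \gamma m\tanh(\alpha)\,\boldsymbol{\sigma}_1 = \gamma m\mathbf{u}, \\
E^2 - \mathbf{p}^2 &= m^2\bigl(\cosh^2(\alpha) - \sinh^2(\alpha)\bigr) = m^2 .
\end{align*}
```

Here $\gamma = u\cdot\gamma_0 = \cosh(\alpha)$ and $\mathbf{u} = \tanh(\alpha)\,\boldsymbol{\sigma}_1$, in agreement with (5.51) and (5.52).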
5.2.7 Proper acceleration


A final ingredient in the formulation of relativistic dynamics is the proper
acceleration. A particle follows a trajectory $x(\tau)$, where $\tau$ is the proper
time. The particle has velocity $v = \dot{x}$, $v^2 = 1$. The proper acceleration
is simply

\[ \dot{v} = \frac{dv}{d\tau}. \tag{5.54} \]

Since $v^2 = 1$, the velocity and acceleration are perpendicular:

\[ \frac{d}{d\tau}(v^2) = 0 = 2\dot{v}\cdot v. \tag{5.55} \]

In many physical phenomena it turns out that a more useful concept is provided
by the acceleration bivector

\[ B_v = \dot{v}\wedge v = \dot{v}v. \tag{5.56} \]

This bivector denotes the acceleration projected into the instantaneous rest frame
of the particle. Typically this bivector multiplied by the rest mass is equated
with a bivector encoding the forces acting on the particle. Any change in the
parameter along the curve will rescale the velocity vector, so $B_v$ can be written
as

\[ B_v = \frac{v'\wedge v}{(v^2)^{3/2}}, \tag{5.57} \]

where $v$ now denotes the tangent vector with respect to an arbitrary parameter
and the prime denotes differentiation with respect to that parameter. This
expression is independent of the parameterisation of the trajectory.

Before applying the various preceding definitions to a range of dynamical prob- 
lems, we turn to a discussion of the Lorentz transformations. This will pave the 
way for a powerful method for studying relativistic problems which is unique to 
geometric algebra. 


5.3 Lorentz transformations 


Lorentz transformations are usually expressed in the form of a coordinate
transformation. We suppose that two inertial observers have set up ‘coordinate
lattices’ in their own rest frames, as discussed in section 5.2.2. We denote these
frames by $S$ and $S'$, and assume that they are set up such that their 1 and 2
axes coincide, but that $S'$ moves at (scalar) velocity $\beta c$ along the 3 axis
as seen in the $S$ frame. We denote the 0 and 3 components by $t$ and $z$
respectively. If the origins of the frames coincide at $t = t' = 0$, the coordinates
of the same spacetime event as measured in the two frames are related by

\[ t' = \gamma(t - \beta z), \quad x^{1'} = x^1, \quad x^{2'} = x^2, \quad z' = \gamma(z - \beta t), \tag{5.58} \]

where $\gamma = (1 - \beta^2)^{-1/2}$ and $\beta$ is the velocity in units of $c$
($\beta < 1$). The inverse relations are easily found to be

\[ t = \gamma(t' + \beta z'), \quad x^1 = x^{1'}, \quad x^2 = x^{2'}, \quad z = \gamma(z' + \beta t'). \tag{5.59} \]

The arguments leading to these transformation laws are discussed in all
introductory texts on relativity (see e.g. Rindler (1977) or French (1968)).

To get a clearer understanding of this transformation law we must first convert
these relations into a transformation law for the frame vectors. The vector $x$ has
been decomposed in two frames, $\{e_\mu\}$ and $\{e'_\mu\}$, so that

\[ x = x^\mu e_\mu = x^{\mu'} e'_\mu. \tag{5.60} \]

We then have, for example,

\[ t = e^0\cdot x, \qquad t' = e^{\prime 0}\cdot x. \tag{5.61} \]

Concentrating on the 0 and 3 components we have

\[ t e_0 + z e_3 = t' e'_0 + z' e'_3, \tag{5.62} \]

and from this we derive the vector relations

\[ e'_0 = \gamma(e_0 + \beta e_3), \qquad e'_3 = \gamma(e_3 + \beta e_0). \tag{5.63} \]

These define the new frame in terms of the old. As a check the new frame vectors
have the correct normalisation,

\[ (e'_0)^2 = \gamma^2(1 - \beta^2) = 1, \qquad (e'_3)^2 = -1. \tag{5.64} \]

The geometry of this transformation is illustrated in figure 5.3.
Figure 5.3 A Lorentz transformation. The transformation leaves the magnitude of
a vector invariant. As the underlying geometry of a spacetime plane is Lorentzian,
vectors of constant magnitude lie on hyperbolae, rather than circles. The
transformed axes define a new coordinate grid.

We saw earlier that bivectors with positive square lead to hyperbolic geometry.
This suggests that we introduce an ‘angle’ $\alpha$ with

\[ \tanh(\alpha) = \beta \tag{5.65} \]

so that

\[ \gamma = \bigl(1 - \tanh^2(\alpha)\bigr)^{-1/2} = \cosh(\alpha). \tag{5.66} \]

The vector $e'_0$ is now

\[ \begin{aligned} e'_0 &= \cosh(\alpha)\, e_0 + \sinh(\alpha)\, e_3 \\ &= \bigl(\cosh(\alpha) + \sinh(\alpha)\, e_3 e_0\bigr) e_0 \\ &= \exp(\alpha\, e_3 e_0)\, e_0, \end{aligned} \tag{5.67} \]

where we have expressed the scalar + bivector term as an exponential. Similarly,
we have

\[ e'_3 = \cosh(\alpha)\, e_3 + \sinh(\alpha)\, e_0 = \exp(\alpha\, e_3 e_0)\, e_3. \tag{5.68} \]

Now recall that these are just two of four frame vectors, and the other pair
are unchanged by the transformation. Since $e_3 e_0$ anticommutes with $e_0$ and
$e_3$, but commutes with $e_1$ and $e_2$, we can express the relationship between
the two frames as

\[ e'_\mu = R e_\mu \tilde{R}, \qquad e^{\prime\mu} = R e^\mu \tilde{R}, \tag{5.69} \]

where

\[ R = e^{\alpha e_3 e_0/2}. \tag{5.70} \]

The same rotor prescription introduced for rotations in Euclidean space also
works for boosts in relativity! This is dramatically simpler than having to work
with $4\times 4$ Lorentz transform matrices.
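The rotor form of the boost in (5.69) and (5.70) can be compared directly with the frame-vector relations (5.63). The sketch below is illustrative only: it assumes a Dirac-matrix representation with $e_\mu = \gamma_\mu$, computes the reverse as $\gamma_0 M^\dagger \gamma_0$ (valid in this representation for multivectors with real coefficients), and uses a hypothetical rapidity.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]


def rev(m):
    """Reverse of a real-coefficient multivector in this representation."""
    return g[0] @ m.conj().T @ g[0]


alpha = 0.7                          # hyperbolic angle, tanh(alpha) = beta
beta, gam = np.tanh(alpha), np.cosh(alpha)

B = g[3] @ g[0]                      # bivector e_3 e_0, which squares to +1
# exp(alpha B / 2) written out using B @ B = 1.
R = np.cosh(alpha / 2) * ONE + np.sinh(alpha / 2) * B

new_frame = [R @ e @ rev(R) for e in g]      # e'_mu = R e_mu R~

# Expected frame from the coordinate transformation, equation (5.63).
expected_e0 = gam * (g[0] + beta * g[3])
expected_e3 = gam * (g[3] + beta * g[0])

print(np.allclose(new_frame[0], expected_e0), np.allclose(new_frame[3], expected_e3))
```

The vectors $e_1$ and $e_2$ commute with the bivector and are returned unchanged, so a single rotor handles all four frame vectors.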


5.3.1 Addition of velocities 


As a simple example, suppose that we are in a frame with basis vectors
$\{\gamma_\mu\}$. We observe two objects flying apart with 4-velocities

\[ v_1 = e^{\alpha_1\gamma_1\gamma_0/2}\,\gamma_0\,e^{-\alpha_1\gamma_1\gamma_0/2} = e^{\alpha_1\gamma_1\gamma_0}\gamma_0 \tag{5.71} \]

and

\[ v_2 = e^{-\alpha_2\gamma_1\gamma_0/2}\,\gamma_0\,e^{\alpha_2\gamma_1\gamma_0/2} = e^{-\alpha_2\gamma_1\gamma_0}\gamma_0. \tag{5.72} \]

What is the relative velocity they see for each other? We form

\[ \frac{v_1\wedge v_2}{v_1\cdot v_2} = \frac{\langle e^{(\alpha_1+\alpha_2)\gamma_1\gamma_0}\rangle_2}{\langle e^{(\alpha_1+\alpha_2)\gamma_1\gamma_0}\rangle_0} = \frac{\sinh(\alpha_1+\alpha_2)\,\gamma_1\gamma_0}{\cosh(\alpha_1+\alpha_2)}. \tag{5.73} \]

Both observers therefore measure a relative velocity of

\[ \tanh(\alpha_1+\alpha_2) = \frac{\tanh(\alpha_1)+\tanh(\alpha_2)}{1+\tanh(\alpha_1)\tanh(\alpha_2)}. \tag{5.74} \]

Addition of (collinear) velocities is achieved by adding hyperbolic angles, and
not the velocities themselves. Replacing the tanh factors by the scalar velocities
$u = c\tanh(\alpha)$ recovers the more familiar expression

\[ u' = \frac{u_1 + u_2}{1 + u_1 u_2/c^2}. \tag{5.75} \]

The surprising conclusion is that addition of velocities in spacetime is really a
generalized rotation in a hyperbolic space! This is quite dramatically different
from the Newtonian prescription of simple vector addition of the velocities.
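The equivalence between adding hyperbolic angles and the familiar velocity-addition formula is easy to confirm numerically. A minimal sketch, with hypothetical function names and sample speeds:

```python
import math


def add_collinear(u1, u2, c=1.0):
    """Compose collinear speeds by adding hyperbolic angles, as in (5.74)."""
    return c * math.tanh(math.atanh(u1 / c) + math.atanh(u2 / c))


def add_collinear_direct(u1, u2, c=1.0):
    """The familiar closed-form expression (5.75)."""
    return (u1 + u2) / (1.0 + u1 * u2 / c**2)


half_plus_half = add_collinear(0.5, 0.5)          # 0.8, not 1.0
near_light = add_collinear(0.9, 0.95)             # still below 1
print(half_plus_half, near_light, add_collinear_direct(0.9, 0.95))
```

However close the individual speeds are to $c$, the sum of two finite hyperbolic angles is finite, so the composed speed remains below $c$.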


5.3.2 Photons, Doppler shifts and aberration 


For many relativistic applications involving the properties of light it is sufficient
to use a simplified model of a photon as a point particle following a null
trajectory. The tangent vector to the path is the wavevector $k$. This provides for
simple formulae for Doppler shifts and aberration. Suppose that two particles
follow different worldlines and that particle 1 emits a photon which is received
by particle 2 (see figure 5.4). The frequency seen by particle 1 is
$\omega_1 = v_1\cdot k$, and that by particle 2 is $\omega_2 = v_2\cdot k$. The
ratio of these describes the Doppler effect, often expressed as a redshift, $z$:

\[ 1 + z = \frac{\omega_1}{\omega_2} = \frac{v_1\cdot k}{v_2\cdot k}. \tag{5.76} \]

Figure 5.4 Photon emission and absorption. A photon is emitted by particle 1 and
received by particle 2.

This can be applied in many ways. For example, suppose that the emitter is
receding in the $\gamma_1$ direction, and $v_2 = \gamma_0$. We have

\[ k = \omega_2(\gamma_0 + \gamma_1), \qquad v_1 = \cosh(\alpha)\,\gamma_0 - \sinh(\alpha)\,\gamma_1, \tag{5.77} \]

so that

\[ 1 + z = \frac{\omega_2\bigl(\cosh(\alpha) + \sinh(\alpha)\bigr)}{\omega_2} = e^{\alpha}. \tag{5.78} \]

The velocity of the emitter in the $\gamma_0$ frame is $\tanh(\alpha)$, and it is
easy to check that

\[ e^{\alpha} = \left(\frac{1 + \tanh(\alpha)}{1 - \tanh(\alpha)}\right)^{1/2}. \tag{5.79} \]

This formula recovers the standard expression for the relativistic Doppler effect:

\[ \omega_2 = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}\omega_1. \tag{5.80} \]

In its current form this formula is appropriate for a source and receiver moving
away from each other at velocity $\beta c$. Had they been approaching each other
the sign of $\beta$ would be reversed, leading to an increased frequency at the
receiver (a blueshift).
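The Doppler ratio follows from nothing more than two inner products, as the following sketch shows. It uses component arrays with an explicit Minkowski metric; the function names and the choice $\omega_2 = 1$ are assumptions for illustration.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # metric with signature (+, -, -, -)


def dot(a, b):
    """Spacetime inner product of two vectors given by their components."""
    return float(a @ ETA @ b)


def doppler_ratio(beta):
    """Return omega_2 / omega_1 for an emitter receding at speed beta."""
    alpha = np.arctanh(beta)
    v1 = np.array([np.cosh(alpha), -np.sinh(alpha), 0.0, 0.0])  # emitter (5.77)
    v2 = np.array([1.0, 0.0, 0.0, 0.0])                          # receiver
    k = np.array([1.0, 1.0, 0.0, 0.0])      # null wavevector with omega_2 = 1
    return dot(v2, k) / dot(v1, k)


receding = doppler_ratio(0.6)       # redshift: sqrt(0.4 / 1.6) = 0.5
approaching = doppler_ratio(-0.6)   # blueshift: sqrt(1.6 / 0.4) = 2.0
print(receding, approaching)
```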

Aberration formulae can be obtained in a similar manner. Suppose that observer 1
has velocity $\gamma_0$, and that this observer receives photons at an angle
$\theta$ to the 1 axis in the 12 plane. The photons are therefore on a null
trajectory with tangent vector

\[ n = \gamma_0 - \cos(\theta)\,\gamma_1 - \sin(\theta)\,\gamma_2, \tag{5.81} \]

and the $\gamma_0$ observer recovers the angle $\theta$ via

\[ \tan(\theta) = \frac{n\cdot\gamma^2}{n\cdot\gamma^1}. \tag{5.82} \]

Suppose now that a second observer moves with velocity $\beta$ relative to the
first along the 1 axis. This observer’s velocity is

\[ v = e_0 = \cosh(\alpha)\,\gamma_0 + \sinh(\alpha)\,\gamma_1 \tag{5.83} \]

and the frame vectors for this observer are

\[ e_1 = \cosh(\alpha)\,\gamma_1 + \sinh(\alpha)\,\gamma_0, \qquad e_2 = \gamma_2, \qquad e_3 = \gamma_3. \tag{5.84} \]

According to this observer the photons arrive at an angle

\[ \tan(\theta') = \frac{n\cdot e^2}{n\cdot e^1} = \frac{\sin(\theta)}{\cosh(\alpha)\cos(\theta) + \sinh(\alpha)}. \tag{5.85} \]

A straightforward rearrangement gives

\[ \cos(\theta') = \frac{\cosh(\alpha)\cos(\theta) + \sinh(\alpha)}{\cosh(\alpha) + \sinh(\alpha)\cos(\theta)} = \frac{\cos(\theta) + \beta}{1 + \beta\cos(\theta)}, \tag{5.86} \]

so observers in relative motion measure different angles to a fixed light source.
This effect can be seen in observations of stars from the Earth. The Earth’s
orbital velocity around the sun has a $\beta$ of roughly $10^{-4}$, so to a good
approximation we have

\[ \cos(\theta') \approx \cos(\theta) + \beta\sin^2(\theta). \tag{5.87} \]

The aberration angle $\phi = \theta - \theta'$ satisfies the approximate formula

\[ \phi \approx \beta\sin(\theta), \tag{5.88} \]

which implies that the aberration varies over a year as $\theta$ varies through a
complete cycle. This variation was first observed by James Bradley in 1727 and was
explained in terms of a particle model of light. Bradley was able to use his data
to give an improved estimate of the speed of light, though the full relativistic
relation of (5.86) cannot be checked in this manner.
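The size of the aberration angle, and the quality of the approximation (5.88), can be gauged with a few lines of arithmetic. This is a sketch with a hypothetical function name and an arbitrary sample angle; the value of $\beta$ is the rough figure quoted above.

```python
import math


def observed_angle(theta, beta):
    """Angle seen by the moving observer, from the exact relation (5.86)."""
    return math.acos((math.cos(theta) + beta) / (1.0 + beta * math.cos(theta)))


beta_earth = 1.0e-4          # rough orbital beta of the Earth quoted in the text
theta = 1.0                  # sample angle to the direction of motion, in radians

phi_exact = theta - observed_angle(theta, beta_earth)
phi_approx = beta_earth * math.sin(theta)          # equation (5.88)

# Convert to arcseconds, the natural unit for stellar aberration.
arcsec = math.degrees(phi_exact) * 3600.0
print(phi_exact, phi_approx, arcsec)
```

The exact and approximate angles differ only at order $\beta^2$, which illustrates why the full relativistic relation cannot be tested with observations of this kind.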


5.4 The Lorentz group 


The full Lorentz group consists of the transformation group for vectors that
preserves lengths and angles. These include reflections and rotations. A
reflection in the hyperplane perpendicular to $n$ is achieved by

\[ a \mapsto -nan^{-1}. \tag{5.89} \]

The $n^{-1}$ is necessary to accommodate both timelike ($n^2 > 0$) and spacelike
($n^2 < 0$) cases. We cannot have null $n$, as the inverse does not exist. A
timelike $n$ generates time-reversal transformations, whereas spacelike
reflections preserve time-ordering. Pairs of either of these result in a
transformation which preserves time-ordering. However, a combination of one
spacelike and one timelike reflection does not preserve the time-ordering. The
full Lorentz group therefore contains four sectors (table 5.1).

                        Parity preserving           Space reflection
Time order preserving   I: proper orthochronous     II: I with space reflection
Time reversal           III: I with time reversal   IV: I with $a \mapsto -a$

Table 5.1 The full Lorentz group. The group of Lorentz transformations falls
into four disjoint sectors. Sectors I and IV have determinant +1, whereas II and
III have determinant −1. Both I and II preserve time-ordering, and the proper
orthochronous transformations (type I) are continuously connected to the
identity.

The structure of the Lorentz group is easily understood in the spacetime
algebra. We concentrate on even numbers of reflections, which have determinant +1
and correspond to type I and type IV transformations. The remaining types
are obtained from these by a single extra reflection. If we combine even numbers
of reflections we arrive at a transformation of the form

\[ a \mapsto \psi a \psi^{-1}, \tag{5.90} \]

where $\psi$ is an even multivector. This expression is currently too general, as
we have not ensured that the right-hand side is a vector. To see how to do this we
decompose $\psi$ into invariant terms. We first note that

\[ \psi\tilde{\psi} = (\psi\tilde{\psi})^{\sim}, \tag{5.91} \]

so $\psi\tilde{\psi}$ is even-grade and equal to its own reverse. It can therefore
only contain a scalar and a pseudoscalar,

\[ \psi\tilde{\psi} = \alpha_1 + I\alpha_2 = \rho e^{I\beta}, \tag{5.92} \]

where $\rho \neq 0$ in order for $\psi^{-1}$ to exist. We can now define a rotor
$R$ by

\[ R = \rho^{-1/2} e^{-I\beta/2}\psi, \tag{5.93} \]

so that

\[ R\tilde{R} = \psi\tilde{\psi}\,(\rho e^{I\beta})^{-1} = 1, \tag{5.94} \]

as required. We now have

\[ \psi = \rho^{1/2} e^{I\beta/2} R, \qquad \psi^{-1} = \rho^{-1/2} e^{-I\beta/2}\tilde{R}, \tag{5.95} \]

and our general transformation becomes

\[ a \mapsto e^{I\beta/2} R a\, e^{-I\beta/2}\tilde{R} = e^{I\beta} R a \tilde{R}. \tag{5.96} \]

The term $Ra\tilde{R}$ is necessarily a vector as it is odd-grade and equal to its
own reverse, so we must restrict $\beta$ to either 0 or $\pi$, leaving the
transformation

\[ a \mapsto \pm R a \tilde{R}. \tag{5.97} \]

The transformation $a \mapsto Ra\tilde{R}$ preserves causal ordering as well as
parity. Transformations of this type are called ‘proper orthochronous’
transformations.



We can prove that transformations parameterised by rotors are proper
orthochronous by starting with the velocity $\gamma_0$ and transforming it to
$v = R\gamma_0\tilde{R}$. We require that the $\gamma_0$ component of $v$ is
positive, that is,

\[ \gamma_0\cdot v = \langle\gamma_0 R\gamma_0\tilde{R}\rangle > 0. \tag{5.98} \]

Decomposing in the $\gamma_0$ frame we can write

\[ R = \alpha + \mathbf{a} + I\mathbf{b} + I\beta \tag{5.99} \]

and we find that

\[ \langle\gamma_0 R\gamma_0\tilde{R}\rangle = \alpha^2 + \mathbf{a}^2 + \mathbf{b}^2 + \beta^2 > 0 \tag{5.100} \]

as required. Our rotor transformation law describes the group of proper
orthochronous transformations, often called the restricted Lorentz group. These
are the transformations of most physical relevance. The negative sign in
equation (5.97) corresponds to $\beta = \pi$ and gives class-IV transformations.


5.4.1 Invariant decomposition and fixed points 
Every rotor in spacetime can be written in terms of a bivector as

\[ R = \pm e^{B/2}. \tag{5.101} \]

(The minus sign is rarely required, and does not affect the vector transformation
law.) We can understand many of the features of spacetime transformations
and rotors through the properties of the bivector $B$. The bivector $B$ can be
decomposed in a Lorentz-invariant manner by first writing

\[ B^2 = \langle B^2\rangle_0 + \langle B^2\rangle_4 = \rho e^{I\phi}, \tag{5.102} \]

and we will assume that $\rho \neq 0$. (The case of a null bivector is treated
slightly differently.) We now define

\[ \hat{B} = \rho^{-1/2} e^{-I\phi/2} B, \tag{5.103} \]

so that

\[ \hat{B}^2 = \rho^{-1} e^{-I\phi} B^2 = 1. \tag{5.104} \]

With this we can now write

\[ B = \rho^{1/2} e^{I\phi/2}\hat{B} = \alpha\hat{B} + \beta I\hat{B}, \tag{5.105} \]

which decomposes $B$ into a pair of bivector blades, $\alpha\hat{B}$ and
$\beta I\hat{B}$. Since

\[ \hat{B}(I\hat{B}) = (I\hat{B})\hat{B} = I, \tag{5.106} \]

the separate bivector blades commute. The rotor $R$ now decomposes into

\[ R = e^{\alpha\hat{B}/2} e^{\beta I\hat{B}/2} = e^{\beta I\hat{B}/2} e^{\alpha\hat{B}/2}, \tag{5.107} \]

exhibiting an invariant split into a boost and a rotation. The boost is generated
by $\hat{B}$ and the rotation by $I\hat{B}$.

For every timelike bivector $\hat{B}$, $\hat{B}^2 = 1$, we can construct a pair of
null vectors $n_\pm$ satisfying

\[ \hat{B}\cdot n_\pm = \pm n_\pm. \tag{5.108} \]

These are necessarily null, since

\[ n_+\cdot n_+ = (\hat{B}\cdot n_+)\cdot n_+ = \hat{B}\cdot(n_+\wedge n_+) = 0, \tag{5.109} \]

with the same holding for $n_-$. The two null vectors can also be chosen so that

\[ n_+\wedge n_- = 2\hat{B}, \tag{5.110} \]

so that they form a null basis for the timelike plane defined by $\hat{B}$ (see
figure 5.5).

Figure 5.5 A timelike plane. Any timelike plane $\hat{B}$, $\hat{B}^2 = 1$,
contains two null vectors $n_+$ and $n_-$. These can be normalised so that
$n_+\wedge n_- = 2\hat{B}$.

The null vectors $n_\pm$ anticommute with $\hat{B}$ and therefore commute with
$I\hat{B}$. The effect of the Lorentz transformation on $n_\pm$ is therefore

\[ R n_\pm\tilde{R} = e^{\alpha\hat{B}/2}\, n_\pm\, e^{-\alpha\hat{B}/2} = e^{\pm\alpha} n_\pm. \tag{5.111} \]

The two null directions are therefore just scaled; their direction is unchanged.
It follows that every Lorentz transformation has two invariant null directions.
The case where the bivector generator itself is null, $B^2 = 0$, corresponds to
the special situation where these two null directions coincide.
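The invariant null directions can be exhibited for a concrete rotor. The sketch below is an illustration under stated assumptions: it uses a Dirac-matrix representation, takes $\hat{B} = \gamma_1\gamma_0$ as the timelike blade, picks hypothetical values for $\alpha$ and $\beta$, and computes the reverse as $\gamma_0 M^\dagger\gamma_0$.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]


def rev(m):
    """Reverse of a real-coefficient multivector in this representation."""
    return g[0] @ m.conj().T @ g[0]


I = g[0] @ g[1] @ g[2] @ g[3]
B_hat = g[1] @ g[0]              # timelike unit blade, B_hat @ B_hat = +1
IB_hat = I @ B_hat               # commuting partner, squares to -1

alpha, beta = 0.9, 0.4           # hypothetical boost and rotation parameters
boost = np.cosh(alpha / 2) * ONE + np.sinh(alpha / 2) * B_hat   # exp(alpha B/2)
rotation = np.cos(beta / 2) * ONE + np.sin(beta / 2) * IB_hat   # exp(beta IB/2)
R = boost @ rotation             # invariant decomposition (5.107)

# Null basis for the plane of B_hat, normalised so that n_plus ^ n_minus = 2 B_hat.
n_plus = g[0] + g[1]
n_minus = g[0] - g[1]

mapped_plus = R @ n_plus @ rev(R)
mapped_minus = R @ n_minus @ rev(R)
print(np.allclose(mapped_plus, np.exp(alpha) * n_plus),
      np.allclose(mapped_minus, np.exp(-alpha) * n_minus))
```

Both null vectors are simply rescaled, by $e^{\alpha}$ and $e^{-\alpha}$ respectively, and the rotation factor drops out because it commutes with them.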


5.4.2 The celestial sphere 


One way to visualise the effect of Lorentz transformations is through their effect
on the past light-cone (see figure 5.6). Each null vector on the past light-cone
maps to a point on the sphere $S^-$, the celestial sphere for the observer.
Suppose then that light is received along the null vector $n$, with the observer’s
velocity chosen to be $\gamma_0$. The relative vector in the $\gamma_0$ frame is
$n\wedge\gamma_0$. This has magnitude

\[ (n\wedge\gamma_0)^2 = (n\cdot\gamma_0)^2 - n^2 = (n\cdot\gamma_0)^2. \tag{5.112} \]

We therefore define the unit relative vector $\mathbf{n}$ by the projective formula

\[ \mathbf{n} = \frac{n\wedge\gamma_0}{n\cdot\gamma_0}. \tag{5.113} \]

Figure 5.6 The celestial sphere. Each observer sees events in their past
light-cone, which can be viewed as defining a sphere (shown here as a circle
in a plane).

Observers passing through the same spacetime point at different velocities see
different celestial spheres. If a second observer has velocity
$v = R\gamma_0\tilde{R}$, the unit relative vectors in this observer’s frame are
formed from $n\wedge v/n\cdot v$. These can be brought to the $\gamma_0$ frame for
comparison by forming

\[ \tilde{R}\,\frac{n\wedge v}{n\cdot v}\,R = \frac{n'\wedge\gamma_0}{n'\cdot\gamma_0}, \tag{5.114} \]

where $n' = \tilde{R}nR$. The effects of Lorentz transformations can be visualised
simply by moving around points on the celestial sphere with the map
$n \mapsto \tilde{R}nR$. We know immediately, then, that two directions remain
invariant and so describe the same points on the celestial spheres of two
observers.


5.4.3 Relativistic visualisation 


We have endeavoured to separate the concept of a single observer from that of a 
coordinate lattice. A clear illustration of this distinction arises when one studies 
how bodies appear when seen by different observers. Concentrating purely on 
coordinates leads directly to the conclusion that there is a measurable Lorentz 
contraction in the direction of motion of a body moving relative to some coor- 
dinate system. But when we consider what two different observers actually see, 
the picture is rather different. 

Suppose that two observers in relative motion observe a sphere. The sphere
and one of the observers are both at rest in the $\gamma_0$ system. This observer
sees the edge of the sphere as a circle defined by the unit vectors

\[ \mathbf{n} = \sin(\theta)\bigl(\cos(\phi)\,\boldsymbol{\sigma}_1 + \sin(\phi)\,\boldsymbol{\sigma}_2\bigr) + \cos(\theta)\,\boldsymbol{\sigma}_3, \qquad 0 \le \phi < 2\pi. \tag{5.115} \]

The angle $\theta$ is fixed so the sphere subtends an angle $2\theta$ on the sky and
is centred on the 3 axis (see figure 5.7). The incoming photon paths from the
sphere are defined by the family of null vectors

\[ n = (1 - \mathbf{n})\gamma_0. \tag{5.116} \]

Figure 5.7 Relativistic visualization of a sphere. The sphere is at rest in
the $\gamma_0$ frame with its centre a unit distance along the 3 axis. The sphere
is simultaneously observed by two observers placed at the spatial origin.
One observer is at rest in the $\gamma_0$ system, and the other is moving along the
1 axis.

Now suppose that a second observer has velocity $\beta = \tanh(\alpha)$ along the
1 axis, so

\[ v = \cosh(\alpha)\,\gamma_0 + \sinh(\alpha)\,\gamma_1 = R\gamma_0\tilde{R}, \tag{5.117} \]

where $R = \exp(\alpha\gamma_1\gamma_0/2)$. To compare what these two observers
see we form

\[ \begin{aligned} n' = \tilde{R}nR ={}& \cosh(\alpha)\bigl(1 + \beta\sin(\theta)\cos(\phi)\bigr)\gamma_0 - \cosh(\alpha)\bigl(\sin(\theta)\cos(\phi) + \beta\bigr)\gamma_1 \\ & - \sin(\theta)\sin(\phi)\,\gamma_2 - \cos(\theta)\,\gamma_3. \end{aligned} \tag{5.118} \]

And from this the new unit relative outward vector is

\[ \mathbf{n}' = \frac{\cosh(\alpha)\bigl(\sin(\theta)\cos(\phi) + \beta\bigr)\boldsymbol{\sigma}_1 + \sin(\theta)\sin(\phi)\,\boldsymbol{\sigma}_2 + \cos(\theta)\,\boldsymbol{\sigma}_3}{\cosh(\alpha)\bigl(1 + \beta\sin(\theta)\cos(\phi)\bigr)}. \tag{5.119} \]

Now consider the vector

\[ \mathbf{c} = \boldsymbol{\sigma}_3 + \sinh(\alpha)\cos(\theta)\,\boldsymbol{\sigma}_1. \tag{5.120} \]

This vector satisfies

\[ \mathbf{c}\cdot\mathbf{n}' = \cosh(\alpha)\cos(\theta), \tag{5.121} \]

which is independent of $\phi$. It follows that, from the point of view of the
second observer, all points on the edge of the sphere subtend the same angle to
$\mathbf{c}$. So the vector $\mathbf{c}$ must lie at the centre of a circle, and
the second observer still sees the edge of the sphere as circular. That is, both
observers see the sphere as a sphere, and there is no observable contraction along
the direction of motion. The only difference is that the moving observer sees the
angular diameter of the sphere reduced from $2\theta$ to $2\theta'$, where

\[ \cos(\theta') = \frac{\cos(\theta)\cosh(\alpha)}{\bigl(1 + \sinh^2(\alpha)\cos^2(\theta)\bigr)^{1/2}}, \qquad \tan(\theta') = \frac{\tan(\theta)}{\cosh(\alpha)}. \tag{5.122} \]


More generally, moving observers see solid objects as rotated, as opposed to 
contracted along their direction of motion. Visualising Lorentz transformations 
of solid objects has now been discussed by various authors (see Rau, Weiskopf 
& Ruder (1998)). But the original observation that spheres remain spheres 
for observers in relative motion had to wait until 1959 — more than 50 years 
after the development of special relativity! The first authors to point out this 
invisibility of the Lorentz contraction were Terrell (1959) and Penrose (1959). 
Both authors based their studies on the fact that the Lorentz group is isomorphic 
to the conformal group acting on the surface of a sphere. This type of geometry 
is discussed in chapter 10. 
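The claim that the moving observer still sees a circular outline can be tested numerically from equations (5.119) to (5.122). The sketch below works with components on the $\boldsymbol{\sigma}_i$ frame, using the ordinary Euclidean dot product; the function name and the sample values of $\theta$ and $\alpha$ are assumptions.

```python
import numpy as np


def apparent_directions(theta, alpha, phi):
    """Components of the unit relative vectors n' of (5.119) on sigma_1..3."""
    beta = np.tanh(alpha)
    sc = np.sin(theta) * np.cos(phi)
    denom = np.cosh(alpha) * (1.0 + beta * sc)
    comps = np.stack(
        [
            np.cosh(alpha) * (sc + beta),
            np.sin(theta) * np.sin(phi),
            np.cos(theta) * np.ones_like(phi),
        ],
        axis=1,
    )
    return comps / denom[:, None]


theta, alpha = 0.3, 1.1                       # hypothetical half-angle and rapidity
phi = np.linspace(0.0, 2.0 * np.pi, 181)      # points around the sphere's edge

n_prime = apparent_directions(theta, alpha, phi)
c = np.array([np.sinh(alpha) * np.cos(theta), 0.0, 1.0])   # centre vector (5.120)

cos_to_centre = n_prime @ c                   # should not depend on phi, (5.121)
theta_prime = np.arccos(cos_to_centre[0] / np.linalg.norm(c))
print(cos_to_centre.min(), cos_to_centre.max(), theta, theta_prime)
```

Every edge direction makes the same angle with $\mathbf{c}$, so the outline remains a circle, with a reduced angular radius satisfying $\tan(\theta') = \tan(\theta)/\cosh(\alpha)$.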
5.4.4 Pure boosts and observer splits


Suppose we are travelling with velocity $u$ and want to boost to velocity $v$. We
seek the rotor for this which contains no additional rotational factors. We have

\[ v = Lu\tilde{L} \tag{5.123} \]

with $La_\perp\tilde{L} = a_\perp$ for any vector $a_\perp$ perpendicular to the
$u\wedge v$ plane. It is clear that the appropriate bivector for the rotor is
$u\wedge v$, and as this anticommutes with $u$ and $v$ we have

\[ v = Lu\tilde{L} = L^2 u \quad\Rightarrow\quad L^2 = vu. \tag{5.124} \]

The solution to this is

\[ L = \frac{1 + vu}{[2(1 + u\cdot v)]^{1/2}} = \exp\left(\frac{\alpha}{2}\,\frac{v\wedge u}{|v\wedge u|}\right), \tag{5.125} \]

where the angle $\alpha$ is defined by $\cosh(\alpha) = u\cdot v$.

Now suppose that we start in the $\gamma_0$ frame and some arbitrary rotor $R$
takes this to $v = R\gamma_0\tilde{R}$. We know that the pure boost for this
transformation is

\[ L = \frac{1 + v\gamma_0}{[2(1 + v\cdot\gamma_0)]^{1/2}} = \exp\left(\frac{\alpha}{2}\,\frac{v\wedge\gamma_0}{|v\wedge\gamma_0|}\right), \tag{5.126} \]

where $v\cdot\gamma_0 = \cosh(\alpha)$. Now define the further rotor $U$ by

\[ U = \tilde{L}R, \qquad U\tilde{U} = \tilde{L}R\tilde{R}L = 1. \tag{5.127} \]

This satisfies

\[ U\gamma_0\tilde{U} = \tilde{L}vL = \gamma_0, \tag{5.128} \]

so $U\gamma_0 = \gamma_0 U$. We must therefore have $U = \exp(I\mathbf{b}/2)$,
where $I\mathbf{b}$ is a relative bivector, and $U$ generates a pure rotation in
the $\gamma_0$ frame. We now have

\[ R = LU, \tag{5.129} \]

which decomposes $R$ into a relative rotation and boost. Unlike the invariant
decomposition into a boost and rotation of equation (5.107), the boost $L$ and
rotation $U$ will not usually commute. The fact that the $LU$ decomposition
initially singled out the $\gamma_0$ vector shows that the decomposition is
frame-dependent. Both the invariant split of equation (5.107) and the
frame-dependent split of equation (5.129) are useful in practice.
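The frame-dependent split $R = LU$ can be carried out explicitly for a sample rotor. The sketch below is illustrative: the Dirac-matrix representation, the reverse computed as $\gamma_0 M^\dagger\gamma_0$, and the particular rotation and boost used to build `R` are all assumptions.

```python
import numpy as np

# Dirac matrices as an assumed matrix representation of the spacetime algebra.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, E2, ONE = np.zeros((2, 2)), np.eye(2), np.eye(4)
g = [np.block([[E2, Z], [Z, -E2]]).astype(complex)] + [
    np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)
]


def rev(m):
    """Reverse of a real-coefficient multivector in this representation."""
    return g[0] @ m.conj().T @ g[0]


def scalar(m):
    """Scalar part of a multivector in this representation."""
    return np.trace(m).real / 4.0


# A sample rotor built from a spatial rotation followed by a boost (hypothetical).
rot = np.cos(0.45) * ONE - np.sin(0.45) * (g[1] @ g[2])      # rotation, 12 plane
bst = np.cosh(0.35) * ONE + np.sinh(0.35) * (g[2] @ g[0])    # boost along 2 axis
R = rot @ bst

v = R @ g[0] @ rev(R)                                        # boosted velocity

# Pure boost from gamma_0 to v, equation (5.126), and the residual rotor U.
L = (ONE + v @ g[0]) / np.sqrt(2.0 * (1.0 + scalar(v @ g[0])))
U = rev(L) @ R

print(np.allclose(L @ g[0] @ rev(L), v),     # L alone carries gamma_0 to v
      np.allclose(U @ g[0], g[0] @ U),       # U is a pure rotation in this frame
      np.allclose(L @ U, R))                 # R = L U
```

In this example `L @ U` and `U @ L` differ, which illustrates the remark that the boost and rotation in this split do not usually commute.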


5.5 Spacetime dynamics 


Dynamics in spacetime is traditionally viewed as a hard subject. This need not
be the case, however. We have now established that Lorentz transformations
which preserve parity and causal structure can be described with rotors. By
parameterising the motion in terms of rotors many equations are considerably
simplified, and can be solved in new ways. This provides a simple understanding
of the Thomas precession, as well as a new formulation of the Lorentz force law
for a particle in an electromagnetic field.


5.5.1 Rotor equations and Fermi transport 


A spacetime trajectory $x(\tau)$ has a future-pointing velocity vector
$\dot{x} = v$. This is normalised to $v^2 = 1$ by parameterising the curve in
terms of the proper time. This suggests an analogy with rigid-body dynamics. We
write

\[ v = R\gamma_0\tilde{R}, \tag{5.130} \]

which keeps $v$ future-pointing and normalised. This moves all of the dynamics
into the rotor $R = R(\tau)$, and this is the key idea which simplifies much of
relativistic dynamics. The next quantity we need to find is the acceleration

\[ \dot{v} = \frac{d}{d\tau}\bigl(R\gamma_0\tilde{R}\bigr) = \dot{R}\gamma_0\tilde{R} + R\gamma_0\dot{\tilde{R}}. \tag{5.131} \]

But just as in three dimensions, $\dot{R}\tilde{R}$ is of even grade and is equal
to minus its reverse, so can only contain bivector terms. We therefore have

\[ \dot{v} = \dot{R}\tilde{R}v - v\dot{R}\tilde{R} = 2(\dot{R}\tilde{R})\cdot v. \tag{5.132} \]

This equation is consistent with the fact that $v\cdot\dot{v} = 0$, which follows
from $v^2 = 1$. If we now form the acceleration bivector we obtain

\[ \dot{v}v = 2(\dot{R}\tilde{R})\cdot v\, v. \tag{5.133} \]

This determines the projection of the bivector into the instantaneous rest frame
defined by $v$. In this frame the projected bivector is purely timelike and
corresponds to a pure boost. The remaining freedom in $\dot{R}\tilde{R}$
corresponds to an additional rotation in $R$ which does not change $v$.

For the purposes of determining the velocity and trajectory of a particle the 
component of RR perpendicular to v is of no relevance. In some applications, 
however, it is useful to attach physical significance to the comoving frame vectors 


{eu}; 


e, = Ry, R, (5.134) 
which have eg = v. The spatial set of vectors {e;} satisfy e;-v = 0 and span the 
instantaneous rest space of v. In this case, the dynamics of the e; can be used 
to determine the component of RR which is not fixed by v alone. 

The vectors $\{e_i\}$ are carried along the trajectory by the rotor $R$. They are said
to be Fermi-transported if their transformation from one instant to the next is
a pure boost in the $v$ frame. In this case the $\{e_i\}$ vectors remain ‘as constant as
possible’, subject to the constraint $e_i \cdot v = 0$. For example, the direction defined
by the angular momentum of an inertial guidance gyroscope (supported at its
centre of mass so there are no torques) is Fermi-transported along the path of
the gyroscope through spacetime.

Figure 5.8 The proper boost. The change in velocity from $\tau$ to $\tau + \delta\tau$
should be described by a rotor solely in the $\dot{v} \wedge v$ plane.

To ensure Fermi-transport of $R \gamma_i \tilde{R}$ we need to ensure that the rotor describes
pure boosts from one instant to the next (see figure 5.8). To first order in $\delta\tau$ we
have

$$ v(\tau + \delta\tau) = v(\tau) + \delta\tau \, \dot{v}. \tag{5.135} $$

The pure boost between $v(\tau)$ and $v(\tau + \delta\tau)$ is determined by the rotor

$$ L = \frac{1 + v(\tau + \delta\tau) v(\tau)}{[2(1 + v(\tau + \delta\tau) \cdot v(\tau))]^{1/2}} = 1 + \tfrac{1}{2} \delta\tau \, \dot{v} v, \tag{5.136} $$

to first order in $\delta\tau$. But since

$$ R(\tau + \delta\tau) = R(\tau) + \delta\tau \, \dot{R}(\tau) = (1 + \delta\tau \, \dot{R} \tilde{R}) R(\tau), \tag{5.137} $$

the additional rotation that takes the $\{e_i\}$ frame from $\tau$ to $\tau + \delta\tau$ is described
by the rotor $1 + \delta\tau \, \dot{R} \tilde{R}$. Equating this to the pure boost $L$ of equation (5.136),
we find that the correct expression to ensure Fermi-transport of the $\{e_i\}$ is

$$ \dot{R} \tilde{R} = \tfrac{1}{2} \dot{v} v. \tag{5.138} $$


This is as one would expect. The bivector describing the change in the rotor is
simply the acceleration bivector, which is the acceleration seen in the
instantaneous rest frame.

Under Fermi-transport the $\{e_i\}$ frame vectors satisfy

$$ \dot{e}_i = 2 (\dot{R} \tilde{R}) \cdot e_i = -e_i \cdot (\dot{v} v). \tag{5.139} $$

This leads directly to the definition of the Fermi derivative

$$ \frac{Da}{D\tau} = \dot{a} + a \cdot (\dot{v} v). \tag{5.140} $$

The Fermi derivative of a vector vanishes if the vector is Fermi-transported along
the worldline. The derivative preserves both the magnitude $a^2$ and $a \cdot v$. The
former holds because

$$ \frac{d}{d\tau} (a^2) = -2 a \cdot (a \cdot (\dot{v} \wedge v)) = 0. \tag{5.141} $$

Conservation of $a \cdot v$ is also straightforward to check:

$$ \begin{aligned} \frac{d}{d\tau} (a \cdot v) &= -(a \cdot (\dot{v} v)) \cdot v + a \cdot \dot{v} \\ &= -a \cdot \dot{v} + a \cdot v \, \dot{v} \cdot v + a \cdot \dot{v} = 0. \end{aligned} \tag{5.142} $$

It follows that if $a$ starts perpendicular to $v$ it remains so. In the case where
$a \cdot v = 0$ the Fermi derivative takes on the simple form

$$ \frac{Da}{D\tau} = \dot{a} + a \cdot \dot{v} \, v = \dot{a} - \dot{a} \cdot v \, v = \dot{a} \wedge v \, v. \tag{5.143} $$

This is the projection of $\dot{a}$ perpendicular to $v$, as expected. The Fermi derivative
extends simply to multivectors as follows:

$$ \frac{DM}{D\tau} = \frac{dM}{d\tau} + M \times (\dot{v} v). \tag{5.144} $$


Derivatives of this type are important in gauge theories and gravity. 
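
As a short consistency check (a worked step added here, not part of the original derivation), the velocity itself is Fermi-transported. Using $\dot{v} v = \dot{v} \wedge v$, the expansion $a \cdot (b \wedge c) = a \cdot b \, c - a \cdot c \, b$, and the conditions $v^2 = 1$ and $v \cdot \dot{v} = 0$, the Fermi derivative of $v$ vanishes identically:

```latex
\begin{aligned}
\frac{Dv}{D\tau}
  &= \dot{v} + v \cdot (\dot{v} \wedge v) \\
  &= \dot{v} + (v \cdot \dot{v})\, v - (v \cdot v)\, \dot{v} \\
  &= \dot{v} + 0 - \dot{v} = 0 .
\end{aligned}
```

This agrees with the requirement that the frame vector $e_0 = v$ is carried along with the spatial vectors $\{e_i\}$.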


5.5.2 Thomas precession 
As an application, consider a particle in a circular orbit (figure 5.9). The
worldline is

$$ x(\tau) = t(\tau) \gamma_0 + a (\cos(\omega t) \gamma_1 + \sin(\omega t) \gamma_2), \tag{5.145} $$

and the velocity is

$$ v = \dot{x} = \dot{t} \, (\gamma_0 + a \omega (-\sin(\omega t) \gamma_1 + \cos(\omega t) \gamma_2)). \tag{5.146} $$

The relative velocity as seen in the $\gamma_0$ frame, $\mathbf{v} = v \wedge \gamma_0 / v \cdot \gamma_0$, has magnitude
$|\mathbf{v}| = a\omega$. We therefore introduce the hyperbolic angle $\alpha$, with

$$ \tanh(\alpha) = a\omega, \qquad \dot{t} = \cosh(\alpha). \tag{5.147} $$

Figure 5.9 Thomas precession. The particle follows a helical worldline,
rotating at a constant rate in the $\gamma_0$ frame.


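The velocity is now

$$ v = \cosh(\alpha) \gamma_0 + \sinh(\alpha) (-\sin(\omega t) \gamma_1 + \cos(\omega t) \gamma_2) = e^{\alpha n/2} \gamma_0 e^{-\alpha n/2}, \tag{5.148} $$

where

$$ n = -\sin(\omega t) \sigma_1 + \cos(\omega t) \sigma_2. \tag{5.149} $$

This form of time dependence in the rotor is inconvenient to work with. To
simplify, we write

$$ n = e^{-\omega t I\sigma_3} \sigma_2 = R_\omega \sigma_2 \tilde{R}_\omega, \tag{5.150} $$

where $R_\omega = \exp(-\omega t I\sigma_3/2)$. We now have

$$ e^{\alpha n/2} = \exp(\alpha R_\omega \sigma_2 \tilde{R}_\omega / 2) = R_\omega R_\alpha \tilde{R}_\omega, \tag{5.151} $$

where

$$ R_\alpha = \exp(\alpha \sigma_2 / 2). \tag{5.152} $$

The velocity is now given by

$$ v = R_\omega R_\alpha \tilde{R}_\omega \gamma_0 R_\omega \tilde{R}_\alpha \tilde{R}_\omega = R_\omega R_\alpha \gamma_0 \tilde{R}_\alpha \tilde{R}_\omega. \tag{5.153} $$

The final expression follows because $R_\omega$ commutes with $\gamma_0$.
We can now see that the rotor for the motion must have the form

$$ R = R_\omega R_\alpha \Phi, \tag{5.154} $$

where $\Phi$ is a rotor that commutes with $\gamma_0$. We want $R$ to describe Fermi
transport of the $\{e_i\}$, so we must have $\dot{v} v = 2 \dot{R} \tilde{R}$. We begin by forming the
acceleration bivector $\dot{v} v$. We can simplify this derivation by writing $v = R_\omega v_\alpha \tilde{R}_\omega$, where
$v_\alpha = R_\alpha \gamma_0 \tilde{R}_\alpha$. We then find that

$$ \begin{aligned} \dot{v} v &= R_\omega \left( 2 (\tilde{R}_\omega \dot{R}_\omega) \cdot v_\alpha \, v_\alpha \right) \tilde{R}_\omega \\ &= -\omega \cosh(\alpha) \, R_\omega \left( (I\sigma_3) \cdot v_\alpha \, v_\alpha \right) \tilde{R}_\omega \\ &= \omega \sinh(\alpha) \cosh(\alpha) \, R_\omega \left( -\cosh(\alpha) \sigma_1 + \sinh(\alpha) I\sigma_3 \right) \tilde{R}_\omega. \end{aligned} \tag{5.155} $$

We also form the rotor equivalent, $2 \dot{R} \tilde{R}$, which is

$$ \begin{aligned} 2 \dot{R} \tilde{R} &= 2 \dot{R}_\omega \tilde{R}_\omega + 2 R_\omega R_\alpha \dot{\Phi} \tilde{\Phi} \tilde{R}_\alpha \tilde{R}_\omega \\ &= -\omega \cosh(\alpha) I\sigma_3 + 2 R_\omega R_\alpha \dot{\Phi} \tilde{\Phi} \tilde{R}_\alpha \tilde{R}_\omega. \end{aligned} \tag{5.156} $$

Equating the two preceding results we find that

$$ \begin{aligned} 2 \dot{\Phi} \tilde{\Phi} &= \omega \cosh^2(\alpha) \, \tilde{R}_\alpha \left( -\sinh(\alpha) \sigma_1 + \cosh(\alpha) I\sigma_3 \right) R_\alpha \\ &= \omega \cosh^2(\alpha) I\sigma_3. \end{aligned} \tag{5.157} $$

The solution with $\Phi = 1$ at $t = 0$ is $\Phi = \exp(\omega \cosh(\alpha) \, t \, I\sigma_3/2)$, so the full rotor
is

$$ R = e^{-\omega t I\sigma_3/2} \, e^{\alpha \sigma_2/2} \, e^{\cosh(\alpha) \omega t I\sigma_3/2}. \tag{5.158} $$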


This form of the rotor ensures that the $e_i = R \gamma_i \tilde{R}$ are Fermi transported. The
fact that the ‘internal’ rotation rate $\omega \cosh(\alpha)$ differs from $\omega$ is due to the fact
that the acceleration is formed in the instantaneous rest frame $v$ and not the fixed
$\gamma_0$ frame. This difference introduces a precession, the Thomas precession. We
can see this effect by imagining the vector $\gamma_1$ being transported around the circle.
The rotated vector is

$$ e_1 = R \gamma_1 \tilde{R}. \tag{5.159} $$

In the low velocity limit $\cosh(\alpha) \to 1$ the vector $\gamma_1$ continues to point in the $\gamma_1$
direction and the frame does not rotate, as we would expect. At larger velocities,
however, the frame starts to precess. After time $t = 2\pi/\omega$, for example, the $\gamma_1$
vector is transformed to

$$ e_1(2\pi/\omega) = e^{\alpha \sigma_2/2} \, e^{2\pi \cosh(\alpha) I\sigma_3} \, \gamma_1 \, e^{-\alpha \sigma_2/2}. \tag{5.160} $$

Dotting this with the initial vector $e_1(0) = \gamma_1$ we see that the vector has precessed
through an angle

$$ \theta = 2\pi (\cosh(\alpha) - 1). \tag{5.161} $$

This shows that the effect is of order $|\mathbf{v}|^2/c^2$. The form of the Thomas precession
justifies one of the relativistic corrections to the spin-orbit coupling in the Pauli
theory of the electron.
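
To get a feel for the size of the effect, equation (5.161) can be evaluated for a few orbital speeds. The sketch below is illustrative only: it assumes units with $c = 1$, so that $\tanh(\alpha) = a\omega$ is the orbital speed, and the function name `thomas_angle` is a hypothetical name introduced here. It also prints the leading-order estimate $\pi |\mathbf{v}|^2$ for comparison.

```python
import math

def thomas_angle(speed):
    """Precession angle in radians per orbit, equation (5.161).

    `speed` is the orbital speed a*omega in units with c = 1, so that
    tanh(alpha) = speed and cosh(alpha) = 1 / sqrt(1 - speed**2).
    """
    cosh_alpha = 1.0 / math.sqrt(1.0 - speed * speed)
    return 2.0 * math.pi * (cosh_alpha - 1.0)

for speed in (0.01, 0.1, 0.5):
    exact = thomas_angle(speed)
    leading = math.pi * speed ** 2  # lowest-order term in |v|^2 / c^2
    print(f"|v| = {speed:4.2f}: theta = {exact:.6e}, pi*v^2 = {leading:.6e}")
```

At low speeds the two columns agree closely, and the difference grows as the speed approaches that of light.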
5.5.3 The Lorentz force law
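The non-relativistic form of the Lorentz force law for a particle of charge $q$ is

$$ \frac{d\mathbf{p}}{dt} = q (\mathbf{E} + \mathbf{v} \times \mathbf{B}), \tag{5.162} $$

where the $\times$ here denotes the vector cross product, and all relative vectors are
expressed in some global Newtonian frame, which we will take to be the $\gamma_0$ frame.
We seek a covariant relativistic version of this law. The quantity $\mathbf{p}$ on the
left-hand side is the relative vector $p \wedge \gamma_0$. Since $dt = \gamma \, d\tau$, we must multiply through
by $\gamma = v \cdot \gamma_0$ to convert the derivative into one with respect to proper time. The
first term on the right-hand side then includes

$$ \begin{aligned} v \cdot \gamma_0 \, \mathbf{E} &= \tfrac{1}{4} \left( \mathbf{E} (v \gamma_0 + \gamma_0 v) + (v \gamma_0 + \gamma_0 v) \mathbf{E} \right) \\ &= \tfrac{1}{4} \left( (\mathbf{E} v - v \mathbf{E}) \gamma_0 - \gamma_0 (\mathbf{E} v - v \mathbf{E}) \right) \\ &= (\mathbf{E} \cdot v) \wedge \gamma_0. \end{aligned} \tag{5.163} $$

Recall at this point that $\mathbf{E}$ is a spacetime bivector built from the $\sigma_k = \gamma_k \gamma_0$, so
$\mathbf{E}$ anticommutes with $\gamma_0$.
For the magnetic term in equation (5.162) we first replace the cross product
by the equivalent three-dimensional expression $(I\mathbf{B}) \cdot \mathbf{v}$. Expanding out, and
expressing in the full spacetime algebra, we obtain

$$ \begin{aligned} \tfrac{1}{2} v \cdot \gamma_0 \, (I\mathbf{B} \mathbf{v} - \mathbf{v} I\mathbf{B}) &= \tfrac{1}{4} \left( I\mathbf{B} (v \gamma_0 - \gamma_0 v) - (v \gamma_0 - \gamma_0 v) I\mathbf{B} \right) \\ &= \tfrac{1}{4} \left( (I\mathbf{B} v - v I\mathbf{B}) \gamma_0 - \gamma_0 (I\mathbf{B} v - v I\mathbf{B}) \right) \\ &= ((I\mathbf{B}) \cdot v) \wedge \gamma_0, \end{aligned} \tag{5.164} $$

where we use the fact that $\gamma_0$ commutes with $I\mathbf{B}$. Combining equations (5.163)
and (5.164) we can now write the Lorentz force law (5.162) in the form

$$ \frac{d\mathbf{p}}{d\tau} = \dot{p} \wedge \gamma_0 = q ((\mathbf{E} + I\mathbf{B}) \cdot v) \wedge \gamma_0. \tag{5.165} $$

We next define the Faraday bivector $F$ by

$$ F = \mathbf{E} + I\mathbf{B}. \tag{5.166} $$

This is the covariant form of the electromagnetic field strength. It unites the
electric and magnetic fields into a single spacetime structure. We study this in
greater detail in chapter 7. The Lorentz force law can now be written

$$ \dot{p} \wedge \gamma_0 = q (F \cdot v) \wedge \gamma_0. \tag{5.167} $$

The rate of working on the particle is $q \mathbf{E} \cdot \mathbf{v}$, so

$$ \frac{dp_0}{dt} = q \mathbf{E} \cdot \mathbf{v}. \tag{5.168} $$

Here, $p_0 = p \cdot \gamma_0$ is the particle’s energy in the $\gamma_0$ frame. Multiplying through by
$v \cdot \gamma_0$, we find

$$ \dot{p} \cdot \gamma_0 = q \mathbf{E} \cdot (v \wedge \gamma_0) = q (F \cdot v) \cdot \gamma_0. \tag{5.169} $$

In the final step we have used $(I\mathbf{B}) \cdot (v \wedge \gamma_0) = 0$. Adding this equation to
equation (5.167), and multiplying on the right by $\gamma_0$, we find

$$ \dot{p} = q F \cdot v. \tag{5.170} $$

Recalling that $p = mv$, we arrive at the relativistic form of the Lorentz force law,

$$ m \dot{v} = q F \cdot v. \tag{5.171} $$

This is manifestly Lorentz covariant, because no particular frame is picked out.
The acceleration bivector is

$$ \dot{v} v = \frac{q}{m} F \cdot v \, v = \frac{q}{m} (F \cdot v) \wedge v = \frac{q}{m} E_v, \tag{5.172} $$

where $E_v$ is the relative electric field in the $v$ frame. A charged point particle
only responds to the instantaneous electric field in its frame. Algebraically, this
bivector is

$$ E_v = \tfrac{1}{2} (F - v F v). \tag{5.173} $$

So $E_v$ is the component of the bivector $F$ which anticommutes with $v$.

Now suppose that we parameterise the velocity with a rotor, so that $v = R \gamma_0 \tilde{R}$.
We have

$$ \dot{v} = 2 (\dot{R} \tilde{R}) \cdot v = \frac{q}{m} F \cdot v. \tag{5.174} $$

The simplest form of the rotor equation comes from equating the projected terms:

$$ \dot{R} = \frac{q}{2m} F R. \tag{5.175} $$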


This is not the most general possibility as we could include an extra multiple of
$F \wedge v \, v$. The rotor determined by equation (5.175) will not, in general, describe
Fermi-transport of the $R \gamma_i \tilde{R}$ vectors. However, equation (5.175) is sufficient to
determine the velocity of the particle, and is certainly the simplest form of rotor 
equation to work with. As we now demonstrate, the rotor equation (5.175) is 
remarkably efficient when it comes to solving the dynamical equations. 
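
Before using the rotor equation it is worth confirming that it reproduces the force law. The following worked step is added for illustration and uses only the reversion property $\tilde{F} = -F$ of a bivector. Reversing equation (5.175) gives $\dot{\tilde{R}} = -\frac{q}{2m} \tilde{R} F$, so that

```latex
\begin{aligned}
\dot{v} &= \dot{R} \gamma_0 \tilde{R} + R \gamma_0 \dot{\tilde{R}} \\
        &= \frac{q}{2m} \left( F R \gamma_0 \tilde{R} - R \gamma_0 \tilde{R} F \right) \\
        &= \frac{q}{2m} \left( F v - v F \right)
         = \frac{q}{m} F \cdot v ,
\end{aligned}
```

which is equation (5.171).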


5.5.4 Constant field 


Motion in a constant field is easy to solve for now. We can immediately integrate
the rotor equation to give

$$ R = \exp\left( \frac{q}{2m} F \tau \right) R_0. \tag{5.176} $$

Figure 5.10 Particle in a constant field. The general motion is a combination
of linear acceleration and circular motion. The plot on the left has
$\mathbf{E}$ and $\mathbf{B}$ colinear. The plot on the right has $\mathbf{E}$ entirely in the $I\mathbf{B}$ plane,
giving rise to cycloids.

To proceed and recover the trajectory we form the invariant decomposition of
$F$. We first write

$$ F^2 = \langle F^2 \rangle + \langle F^2 \rangle_4 = \rho e^{I\phi}, \tag{5.177} $$

so that we can set

$$ F = \rho^{1/2} e^{I\phi/2} \hat{F} = \alpha \hat{F} + I \beta \hat{F}, \tag{5.178} $$

where $\hat{F}^2 = 1$. (If $F$ is null a slightly different procedure is followed.) We now
have

$$ R = \exp\left( \frac{q}{2m} \alpha \hat{F} \tau \right) \exp\left( \frac{q}{2m} I \beta \hat{F} \tau \right) R_0. \tag{5.179} $$

Next we decompose the initial velocity $v_0 = R_0 \gamma_0 \tilde{R}_0$ into components in and out
of the $\hat{F}$ plane:

$$ v_0 = \hat{F}^2 v_0 = \hat{F} \, \hat{F} \cdot v_0 + \hat{F} \, \hat{F} \wedge v_0 = v_{0\parallel} + v_{0\perp}. \tag{5.180} $$

Now $v_{0\parallel} = \hat{F} \, \hat{F} \cdot v_0$ anticommutes with $\hat{F}$, and $v_{0\perp}$ commutes with $\hat{F}$, so

$$ \dot{x} = \exp\left( \frac{q}{m} \alpha \hat{F} \tau \right) v_{0\parallel} + \exp\left( \frac{q}{m} I \beta \hat{F} \tau \right) v_{0\perp}. \tag{5.181} $$

This integrates immediately to give the particle history

$$ x - x_0 = \frac{e^{q \alpha \hat{F} \tau / m} - 1}{q \alpha / m} \, \hat{F} \cdot v_0 - \frac{e^{q I \beta \hat{F} \tau / m} - 1}{q \beta / m} \, (I \hat{F}) \cdot v_0. \tag{5.182} $$

The first term gives linear acceleration and the second is periodic and drives
rotational motion (see figure 5.10). One has to be slightly careful integrating the
velocity equation in the case where either $\alpha$ or $\beta$ is zero, which corresponds to
perpendicular $\mathbf{E}$ and $\mathbf{B}$ fields.
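
As an illustration of equation (5.182), consider the simplest special case, chosen here as an assumption for the example: a purely electric field $F = E \sigma_1$ with the particle initially at rest, $v_0 = \gamma_0$. Then $\hat{F} = \sigma_1$, $\alpha = E$ and $\beta = 0$. Since $\gamma_0$ lies in the $\hat{F}$ plane, $v_{0\perp} = 0$ and only the first term survives. Writing $k = qE/m$ and using $\sigma_1 \cdot \gamma_0 = \gamma_1$ and $\sigma_1 \gamma_1 = \gamma_0$, we find

```latex
\begin{aligned}
x - x_0 &= \frac{e^{k \sigma_1 \tau} - 1}{k} \, \gamma_1 \\
        &= \frac{1}{k} \left( (\cosh(k\tau) - 1) \, \gamma_1 + \sinh(k\tau) \, \sigma_1 \gamma_1 \right) \\
        &= \frac{1}{k} \left( \sinh(k\tau) \, \gamma_0 + (\cosh(k\tau) - 1) \, \gamma_1 \right) .
\end{aligned}
```

This is the familiar hyperbolic worldline of a uniformly accelerated particle, with proper acceleration $k = qE/m$.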
5.5.5 Particle in a Coulomb field


As a further application we consider the case of a charged point particle moving 
in a central Coulomb field. If relativistic effects are ignored the problem reduces 
to the inverse-square force law described in section 3.2.1. We therefore expect 
that the relativistic description will add additional perturbative effects to the 
elliptic and hyperbolic orbits found in the inverse-square case. We assume for 
simplicity that the central charge has constant velocity $\gamma_0$ and is placed at the
origin. The electromagnetic field is

$$ F = \frac{Q \mathbf{x}}{4\pi\epsilon_0 r^3}, \tag{5.183} $$

where $\mathbf{x} = x \wedge \gamma_0$ and $r^2 = \mathbf{x}^2$. In this section all bold symbols denote relative
vectors in the $\gamma_0$ frame. The question of how to generalise the non-relativistic
definitions of centre of mass and relative separation turns out to be surprisingly
complex and is not tackled here. Instead we will simply assume that the source of
the Coulomb field is far heavier than the test charge so that the source’s motion
can be ignored.

There are two constants of motion for this force law. The first is the energy

$$ E = m v \cdot \gamma_0 + \frac{qQ}{4\pi\epsilon_0 r}. \tag{5.184} $$

If the charges are opposite, $qQ$ is negative and the potential is attractive. The
force law can now be written in the $\gamma_0$ frame as

$$ m \frac{d^2 \mathbf{x}}{d\tau^2} = \frac{qQ \mathbf{x}}{4\pi\epsilon_0 r^3} \left( \frac{E}{m} - \frac{qQ}{4\pi\epsilon_0 m r} \right). \tag{5.185} $$

The second conserved quantity is the angular momentum, which is conserved for
any central force, as is the case in equation (5.185). If we define the spacetime
bivector $L = x \wedge p$ we find that

$$ \dot{L} = q x \wedge (F \cdot v). \tag{5.186} $$

It follows that the trivector $L \wedge \gamma_0$ is conserved. Equivalently, we can define the
relative bivector

$$ I\mathbf{l} = L \wedge \gamma_0 \, \gamma_0, \tag{5.187} $$

so that the relative vector $\mathbf{l}$ is conserved. This is the relative angular momentum
vector and satisfies $\mathbf{x} \cdot \mathbf{l} = 0$. It follows that the test particle’s motion takes place
in a constant plane as seen from the source charge.

In order to integrate the rotor equation we need to find a way to express the
field as a function of the particle’s proper time. This is achieved by introducing
an angular measure in the plane of motion. Suppose that we align the $\sigma_3$ axis
with $\mathbf{l}$, so that we can write

$$ \hat{\mathbf{x}}(\tau) = \sigma_1 \exp(I\sigma_3 \theta(\tau)), \tag{5.188} $$

where $\hat{\mathbf{x}}$ is the unit relative vector $\mathbf{x}/r$. It follows that

$$ \mathbf{l}^2 = m^2 r^4 \dot{\hat{\mathbf{x}}}^2 = m^2 r^4 \dot{\theta}^2. \tag{5.189} $$

If we set $l = |\mathbf{l}|$ we have $l = m r^2 \dot{\theta}$, which enables us to express the Coulomb field
as

$$ F = \frac{Q m \dot{\theta}}{4\pi\epsilon_0 l} \, \sigma_1 \exp(I\sigma_3 \theta). \tag{5.190} $$

If we now let

$$ \kappa = \frac{qQ}{4\pi\epsilon_0 l} \tag{5.191} $$

the rotor equation takes on the simple form

$$ \frac{dR}{d\theta} = \frac{\kappa}{2} \, \sigma_1 \exp(I\sigma_3 \theta) R. \tag{5.192} $$

Re-expressing the differential equation in terms of $\theta$ is a standard technique for
solving inverse-square problems in non-relativistic physics. But this technique
fails to give a simple solution to the relativistic equation (5.185). Instead, we
see that the technique gives a simple solution to the relativistic problem only if
applied directly to the rotor equation.

To solve equation (5.192) we first set

$$ R = \exp(-I\sigma_3 \theta/2) U. \tag{5.193} $$

It follows that

$$ \frac{dU}{d\theta} = \tfrac{1}{2} (\kappa \sigma_1 + I\sigma_3) U, \tag{5.194} $$

which integrates straightforwardly. The full rotor is then

$$ R = e^{-I\sigma_3 \theta/2} \, e^{A\theta/2} R_0, \tag{5.195} $$

where

$$ A = \kappa \sigma_1 + I\sigma_3. \tag{5.196} $$

The initial conditions can be chosen such that $\theta(0) = 0$, which tells us how to
align the $\sigma_1$ axis. The rotor $R_0$ then specifies the initial velocity $v_0$. If we are not
interested in transporting a frame, $R_0$ can be set equal to a pure boost from $\gamma_0$
to $v_0$.

With the rotor equation now solved, the velocity can be integrated to recover
the trajectory. Clearly, different types of path are obtained for the different signs
of $A^2 = \kappa^2 - 1$. The equation relating $r$ and $\theta$ is found from the relation

$$ \frac{d}{d\theta} \left( \frac{1}{r} \right) = -\frac{m}{l} \dot{r}. \tag{5.197} $$

To evaluate the right-hand side we need

$$ \begin{aligned} \dot{r} &= \langle e^{-I\sigma_3 \theta/2} \sigma_1 e^{I\sigma_3 \theta/2} \, R \gamma_0 \tilde{R} \gamma_0 \rangle \\ &= -\langle \gamma_1 e^{A\theta/2} v_0 e^{-A\theta/2} \rangle \\ &= -\langle e^{-A\theta} \gamma_1 v_0 \rangle. \end{aligned} \tag{5.198} $$

It follows that

$$ \frac{d}{d\theta} \left( \frac{1}{r} \right) = \frac{m}{l} \langle e^{-A\theta} \gamma_1 v_0 \rangle. \tag{5.199} $$

For a given $l$ and $v_0$ this integrates to give the trajectory in the $I\sigma_3$ plane.
Suppose, for example, that we are interested in bound states. For these we
must have $A^2 < 0$, which implies that $\kappa^2 < 1$. We write

$$ |A| = (1 - \kappa^2)^{1/2} \tag{5.200} $$

for the magnitude of $A$. To simplify the equations we will assume that $\tau = 0$
corresponds to a point on the trajectory where $v$ is perpendicular to $\mathbf{x}$. In this
case we have

$$ v_0 = \cosh(\alpha_0) \gamma_0 + \sinh(\alpha_0) \gamma_2 \tag{5.201} $$

so that the trajectory is determined by

$$ \frac{d}{d\theta} \left( \frac{1}{r} \right) = -\frac{m}{l |A|} \left( \kappa \cosh(\alpha_0) + \sinh(\alpha_0) \right) \sin(|A| \theta). \tag{5.202} $$

The magnitude of the angular momentum is given by $l = m r_0 \sinh(\alpha_0)$, which
can be used to write

$$ m (\kappa \cosh(\alpha_0) + \sinh(\alpha_0)) = (E^2 - m^2 |A|^2)^{1/2}. \tag{5.203} $$

The trajectory is then given by

$$ \frac{l |A|^2}{r} = -\kappa E + (E^2 - m^2 |A|^2)^{1/2} \cos(|A| \theta), \tag{5.204} $$

and since this represents a bound state, $\kappa$ must be negative. The fact that the
angular term goes as $\cos(|A| \theta)$ shows that this equation specifies a precessing
ellipse (figure 5.11). The precession rate of the ellipse can be found simply using
the technique of section 3.3.
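
The orbit equation (5.204) is simple to evaluate numerically. The sketch below is an illustration under stated assumptions rather than a prescribed method: it uses units with $c = 1$, takes $\tau = 0$ to be the point of closest approach at radius $r_0$ (so that $\kappa \cosh(\alpha_0) + \sinh(\alpha_0) > 0$), and the function name `coulomb_orbit` and the parameter values are hypothetical choices made here. The final line uses the fact that the radial motion repeats when $|A| \theta$ advances by $2\pi$, so the ellipse turns through an extra angle $2\pi (1/|A| - 1)$ per radial period.

```python
import math

def coulomb_orbit(m, r0, alpha0, kappa):
    """Return (l, E, absA, r) for the bound orbit of equation (5.204).

    Assumes tau = 0 is the closest approach at radius r0 with rapidity
    alpha0, and -1 < kappa < 0 (attractive, bound).  Units: c = 1.
    """
    l = m * r0 * math.sinh(alpha0)               # angular momentum
    E = m * math.cosh(alpha0) + kappa * l / r0   # energy, equation (5.184)
    absA = math.sqrt(1.0 - kappa ** 2)           # equation (5.200)
    amp = math.sqrt(E ** 2 - (m * absA) ** 2)    # equation (5.203)

    def r(theta):
        """Radius as a function of the orbital angle theta."""
        return l * absA ** 2 / (-kappa * E + amp * math.cos(absA * theta))

    return l, E, absA, r

l, E, absA, r = coulomb_orbit(m=1.0, r0=1.0, alpha0=0.5, kappa=-0.3)
advance = 2.0 * math.pi * (1.0 / absA - 1.0)  # extra angle per radial period
print("E/m =", E, " r(0) =", r(0.0), " advance per orbit =", advance)
```

Evaluating `r(0.0)` recovers the starting radius, which provides a check that equations (5.203) and (5.204) are mutually consistent.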


5.5.6 The gyromagnetic moment 


Particles with non-zero spin have a magnetic moment which is proportional to
the spin. In non-relativistic physics we write this as $\boldsymbol{\mu} = \gamma \mathbf{s}$, where $\gamma$ is the
gyromagnetic ratio and $\mathbf{s}$ is the spin (which has units of angular momentum).
The gyromagnetic ratio is usually written in the form

$$ \gamma = g \frac{q}{2m}, \tag{5.205} $$

where $m$ is the particle mass, $q$ is the charge and $g$ is the (reduced) gyromagnetic
ratio. The last is determined experimentally via the precession of the spin vector
which, in classical physics, obeys

$$ \dot{\mathbf{s}} = g \frac{q}{2m} (I\mathbf{B}) \cdot \mathbf{s}. \tag{5.206} $$

Figure 5.11 Motion in a Coulomb field. For bound orbits ($E < m$) the
particle’s motion is described by a precessing ellipse. The plot is for $|A| = 0.95$.
The units are arbitrary.

We seek a relativistic extension of this equation. We start by introducing the
relativistic spin vector $s$, which is perpendicular to the velocity $v$, so $s \cdot v = 0$.
For a particle at rest in the $\gamma_0$ frame we have $s = \mathbf{s} \gamma_0$. The particle’s spin
will interact with the magnetic field only in the instantaneous rest frame, so we
should regard equation (5.206) as referring to this frame.
Given that $s = \mathbf{s} \gamma_0$ we find that

$$ (I\mathbf{B}) \cdot \mathbf{s} = \langle (F \wedge \gamma_0) \gamma_0 \, s \gamma_0 \rangle_2 = (F \cdot s) \wedge \gamma_0. \tag{5.207} $$

So, for a particle at rest in the $\gamma_0$ frame, equation (5.206) can be written

$$ \frac{ds}{dt} = g \frac{q}{2m} (F \cdot s) \wedge \gamma_0 \, \gamma_0. \tag{5.208} $$

To write down an equation which is valid for arbitrary velocity we must replace
the two factors of $\gamma_0$ on the right-hand side with the velocity $v$. On the left-hand
side we need the derivative of $s$ which preserves $s \cdot v = 0$. This is the Fermi
derivative of section 5.5.1, which tells us that the relativistic form of the spin
precession equation is

$$ \dot{s} + s \cdot (\dot{v} v) = g \frac{q}{2m} (F \cdot s) \wedge v \, v. \tag{5.209} $$

This equation tells us how much the spin vector rotates, relative to a
Fermi-transported frame, which is physically sensible. We can eliminate the
acceleration bivector $\dot{v} v$ by using the relativistic Lorentz force law to find

$$ \begin{aligned} \dot{s} &= g \frac{q}{2m} (F \cdot s) \wedge v \, v - \frac{q}{m} s \cdot (F \cdot v \, v) \\ &= \frac{q}{2m} \left( g (F \cdot s) \wedge v + 2 (F \cdot s) \cdot v \right) v \\ &= \frac{q}{m} F \cdot s + (g - 2) \frac{q}{2m} (F \cdot s) \wedge v \, v. \end{aligned} \tag{5.210} $$

This is called the Bargmann–Michel–Telegdi equation.
For the value $g = 2$, the Bargmann–Michel–Telegdi equation reduces to

$$ \dot{s} = \frac{q}{m} F \cdot s, \tag{5.211} $$

which has the same form as the Lorentz force law. In this sense, g = 2 is the 
most natural value of the gyromagnetic ratio of a point particle in relativistic 
physics. Ignoring quantum corrections, this is indeed found to be the value for an 
electron. Quantum corrections tell us that for an electron $g = 2(1 + \alpha/2\pi + \cdots)$.
The corrections are due to the fact that the electron is never truly isolated and 
constantly interacts with virtual particles from the quantum vacuum. 
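
To indicate the size of the leading quantum correction, the first-order expression $g = 2(1 + \alpha/2\pi)$ can be evaluated directly. This is a small illustrative sketch: here $\alpha$ denotes the fine-structure constant, and the numerical value used for it is an assumption supplied for the example rather than a figure quoted in the text.

```python
import math

# Fine-structure constant; approximate value assumed for illustration.
ALPHA = 1.0 / 137.035999

# First-order corrected gyromagnetic ratio, g = 2 (1 + alpha / (2 pi)).
g_first_order = 2.0 * (1.0 + ALPHA / (2.0 * math.pi))
print(f"g = {g_first_order:.7f}")
```

The departure from the natural value $g = 2$ is therefore roughly one part in a thousand.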

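Given a velocity $v$ and a spin vector $s$, with $v \cdot s = 0$ and $s$ normalised to
$s^2 = -1$, we can always find a rotor $R$ such that

$$ v = R \gamma_0 \tilde{R}, \qquad s = R \gamma_3 \tilde{R}. \tag{5.212} $$

For these we have

$$ \dot{v} = 2 (\dot{R} \tilde{R}) \cdot v, \qquad \dot{s} = 2 (\dot{R} \tilde{R}) \cdot s. \tag{5.213} $$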
For a particle with g = 2, this pair of equations reduces to the single rotor 
equation (5.175). The simple form of this equation further justifies the claim 
that g = 2 is the natural, relativistic value of the gyromagnetic ratio. This 
also means that once we have solved the rotor equation, we can simultaneously 


compute both the trajectory and the spin precession of a classical relativistic 
particle with g = 2. 


5.6 Notes 


There are many good introductions to special relativity. Standard references 
include the books by French (1968), Rindler (1977) and d’Inverno (1992).
Practically all introductory books make heavy use of coordinate geometry. Geometric
algebra was first systematically applied to the study of relativistic physics in the
book Space-Time Algebra by Hestenes (1966). Since this book was published
in 1966 many authors have applied spacetime algebra techniques to relativistic
physics. The two most significant papers are again by Hestenes, ‘Proper particle
mechanics’ and ‘Proper dynamics of a rigid point particle’ (1974a,b). These
papers detail the use of rotor equations for solving problems in electrodynamics,
and much of section 5.5 follows their presentation.


5.7 Exercises


5.1 Suppose that the spacetime bivector $\hat{B}$ satisfies $\hat{B}^2 = 1$. By writing
$\hat{B} = \mathbf{a} + I\mathbf{b}$ in the $\gamma_0$ frame, show that we can write

$$ \hat{B} = \cosh(u) \hat{\mathbf{a}} + \sinh(u) I \hat{\mathbf{b}} = e^{u I \hat{\mathbf{b}} \hat{\mathbf{a}}} \hat{\mathbf{a}}, $$

where $\hat{\mathbf{a}}^2 = \hat{\mathbf{b}}^2 = 1$. Hence explain why we can write $\hat{B} = R \sigma_3 \tilde{R}$. By
considering the null vectors $\gamma_0 \pm \gamma_3$, prove that we can always find two
null vectors satisfying

$$ \hat{B} \cdot n_\pm = \pm n_\pm. $$

5.2 The boost $L$ from velocity $u$ to velocity $v$ satisfies

$$ v = L u \tilde{L} = L^2 u, $$

with $L \tilde{L} = 1$. Prove that a solution to this equation is

$$ L = \frac{1 + vu}{[2(1 + v \cdot u)]^{1/2}}. $$

Is this solution unique? Show further that this solution can be written
in the form

$$ L = \exp\left( \frac{\alpha}{2} \, \frac{v \wedge u}{|v \wedge u|} \right), $$

where $\alpha > 0$ satisfies $\cosh(\alpha) = u \cdot v$.

5.3 Compton scattering occurs when a photon scatters off an electron. If
we ignore quantum effects this can be modelled as a relativistic collision
process. The incident photon has wavelength $\lambda_0$ in the frame in which
the electron is initially stationary. Show that the wavelength after
scattering, $\lambda$, satisfies

$$ \lambda - \lambda_0 = \frac{2\pi\hbar}{mc} (1 - \cos(\theta)), $$

where $\theta$ is the angle through which the photon scatters.

5.4 A relativistic particle has velocity $v = R \gamma_0 \tilde{R}$. Show that $v$ satisfies the
Lorentz force equation $m \dot{v} = q F \cdot v$ if $R$ satisfies

$$ \dot{R} = \frac{q}{2m} F R. $$

Show that the solution to this for a constant field is

$$ R = \exp(q F \tau / 2m) R_0. $$

Given that $F$ is null, $F^2 = 0$, show that $v$ is given by the polynomial

$$ v = v_0 + \frac{q}{m} \tau \, F \cdot v_0 - \frac{q^2}{4m^2} \tau^2 \, F v_0 F. $$

Suppose now that $F = \sigma_1 + I\sigma_2$ and the particle is initially at rest in
the $\gamma_0$ frame. Sketch the resultant motion in the $\gamma_1 \gamma_3$ plane.

5.5 One way to construct the Fermi derivative of a vector $a$ is to argue that
we should ‘de-boost’ the vector at proper time $\tau + \delta\tau$ before comparing
it with $a(\tau)$. Explain why this leads us to evaluate

$$ \lim_{\delta\tau \to 0} \frac{1}{\delta\tau} \left( \tilde{L} a(\tau + \delta\tau) L - a(\tau) \right), $$

and confirm that this evaluates to $\dot{a} + a \cdot (\dot{v} v)$.

5.6 A frame is Fermi-transported along the worldline of a particle with
velocity $v = R \gamma_0 \tilde{R}$. The rotor $R$ is decomposed into a rotation and boost
in the $\gamma_0$ frame as $R = LU$. Show that the rotation $U$ satisfies

$$ 2 \dot{U} \tilde{U} = -(\tilde{L} \dot{L} + \gamma_0 \tilde{L} \dot{L} \gamma_0). $$

What is the interpretation of the right-hand side in terms of the $\gamma_0$
frame?

5.7 The bivector $B = a \wedge b$ is Fermi-transported along a worldline by
Fermi-transporting the two vectors $a$ and $b$. Show that $B$ remains a blade, and
that the bivector satisfies

$$ \frac{dB}{d\tau} + B \times (\dot{v} v) = 0. $$

5.8 A point particle with a gyromagnetic ratio $g = 2$ is in a circular orbit
around a central Coulomb field. Show that in one complete orbit the
spin vector rotates in the plane $A = \kappa \sigma_1 + I\sigma_3$ by an amount $2\pi |A|$,
where

$$ \kappa = \frac{qQ}{4\pi\epsilon_0 l}, $$

and $l$ is the angular momentum.

5.9 Show that the Bargmann–Michel–Telegdi equation of (5.210) for a
relativistic point particle with spin vector $s$ can be written

$$ \dot{s} = \frac{q}{m} \left( F + \tfrac{1}{2} (g - 2) F \wedge v \, v \right) \cdot s. $$

Given that $v = R \gamma_0 \tilde{R}$ and $s = R \gamma_3 \tilde{R}$, show that the rotor $R$ satisfies the
equation

$$ \dot{R} = \frac{q}{2m} F R + \frac{q}{4m} (g - 2) R \, I\mathbf{B}_0, $$

where

$$ I\mathbf{B}_0 = (\tilde{R} F R) \wedge \gamma_0 \, \gamma_0. $$

Assuming that the electromagnetic field $F$ is constant, prove that $\mathbf{B}_0$
is also constant. Hence study the precession of $s$ for a particle with a
gyromagnetic ratio $g \neq 2$.
Geometric calculus


Geometric algebra provides us with an invertible product for vectors. In this 
chapter we investigate the new insights this provides for the subject of vector 
calculus. The familiar gradient, divergence and curl operations all result from 
the action of the vector operator, $\nabla$. Since this operator is vector-valued, we
can now form its geometric product with other multivectors. We call this the
vector derivative. Unlike the separate divergence and curl operations, the
vector derivative has the important property of being invertible. That is to say,
Green’s functions exist for $\nabla$ which enable initial conditions to be propagated
off a surface. 

The synthesis of vector differentiation and geometric algebra described in this 
chapter is called ‘geometric calculus’. We will see that geometric calculus pro- 
vides new insights into the subject of complex analysis and enables the concept of 
an analytic function to be extended to arbitrary dimensions. In three dimensions 
this generalisation gives rise to the angular eigenstates of the Pauli theory, and 
the spacetime generalisation of an analytic function defines the wavefunction for 
a massless spin-1/2 particle. Clearly there are many insights to be gained from 
a unified treatment of calculus based around the geometric product. 

The early sections of this chapter discuss the vector derivative, and its asso- 
ciated Green’s functions, in flat spaces. This way we can quickly assemble a 
number of results of central importance in later chapters. The generalisations 
to embedded surfaces and manifolds are discussed in the final section. This is 
a large and important subject, which has been widely discussed elsewhere. Our 
presentation here is kept brief, focusing on the key results which are required 
later in this book. 
6.1 The vector derivative


The vector derivative is denoted with the symbol $\nabla$ (or $\boldsymbol{\nabla}$ in two and three
dimensions). Algebraically, this has all of the properties of a vector (grade-1)
object in a geometric algebra. The operator properties of $\nabla$ are contained in the
definition that the inner product of $\nabla$ with any vector $a$ results in the directional
derivative in the $a$ direction. That is,

$$ a \cdot \nabla F(x) = \lim_{\epsilon \to 0} \frac{F(x + \epsilon a) - F(x)}{\epsilon}, \tag{6.1} $$

where we assume that this limit exists and is well defined. Suppose that we
now define a constant coordinate frame $\{e_k\}$ with reciprocal frame $\{e^k\}$. Spatial
coordinates are defined by $x^k = e^k \cdot x$, and the summation convention is assumed
except where stated otherwise. The vector derivative can be written

$$ \nabla = \sum_k e^k \frac{\partial}{\partial x^k} = e^k \partial_k, \tag{6.2} $$

where we introduce the useful abbreviation

$$ \partial_i = \frac{\partial}{\partial x^i}. \tag{6.3} $$

The frame decomposition $\nabla = e^k \partial_k$ shows clearly how the vector derivative
combines the algebraic properties of a vector with the operator properties of the
partial derivatives. It is a straightforward exercise to confirm that the definition
of $\nabla$ is independent of the choice of frame.


6.1.1 Scalar fields 


As a first example, consider the case of a scalar field $\phi(x)$. Acting on $\phi$, the vector
derivative $\nabla$ returns the gradient, $\nabla\phi$. This is the familiar grad operation. The
result is a vector whose components in the $\{e^k\}$ frame are the partial derivatives
with respect to the $x^k$ coordinates. The simplest example of a scalar field is the
quantity $a \cdot x$, where $a$ is a constant vector. We write $a \cdot x = x^j a_j$, so that the
gradient becomes

$$ \nabla (x \cdot a) = e^i \frac{\partial x^j}{\partial x^i} a_j = e^i a_j \delta^j_i. \tag{6.4} $$

But the right-hand side simply expresses the vector $a$ in the $\{e^k\}$ frame, so we
are left with the frame-free result

$$ \nabla (x \cdot a) = a. \tag{6.5} $$

This result is independent of both the dimensions and signature of the vector
space. Many formulae for the vector derivative can be built up by combining this
primitive result with the chain and product rules for differentiation. A particular
application of this result is to the coordinates themselves,

$$ \nabla x^k = \nabla (x \cdot e^k) = e^k, \tag{6.6} $$

a formula which generalises to curvilinear coordinate systems.

As a second example, consider the derivative of the scalar $x^2$. We first derive
the result in coordinates before discussing a more elegant, frame-free derivation.
We form

$$ \nabla (x^2) = e^i \partial_i (x^j x^k) \, e_j \cdot e_k = 2x, \tag{6.7} $$

which recovers the expected result. It is extremely useful to be able to perform
such manipulations without reference to any coordinate frame. This requires a
notation to keep track of which terms are being differentiated in a given
expression. A suitable convention is to use overdots to define the scope of the vector
derivative. With this notation we can write

$$ \nabla (x^2) = \dot{\nabla} (\dot{x} \cdot x) + \dot{\nabla} (x \cdot \dot{x}) = 2 \dot{\nabla} (\dot{x} \cdot x). \tag{6.8} $$

In the final term it is only the first factor of $x$ which is differentiated, while the
second is held constant. We can therefore apply the result of equation (6.5),
which immediately gives $\nabla (x^2) = 2x$. More complex results can be built up in
a similar manner.
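
As one further illustration of building results from the chain rule (an example added here, assuming a Euclidean space and $x \neq 0$ so that $|x| = (x^2)^{1/2}$ is differentiable), the gradient of the length of $x$ follows directly from $\nabla (x^2) = 2x$:

```latex
\nabla |x| = \nabla (x^2)^{1/2}
           = \tfrac{1}{2} (x^2)^{-1/2} \, \nabla (x^2)
           = \frac{x}{|x|} .
```

The result is the unit vector in the radial direction, as one would expect for a field whose contours are spheres centred on the origin.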

In Euclidean spaces V@ points in the direction of steepest increase of ø. This 
is illustrated in equation (6.5). To get the biggest increase in a-x for a given 
step size you must clearly move in the positive a direction, since moving in any 
orthogonal direction does not change the value. More generally, suppose V¢ = J 
and consider the contraction of this equation with the unit vector n, 


nVb=nd. (6.9) 


We seek the direction of n which maximises this value. Clearly in a Euclidean 
space this must be the J direction, so J points in the direction of greatest increase 
of ¢. Also, setting n in the J direction shows that the magnitude of J is simply 
the derivative in the direction of steepest increase. 

In mixed signature spaces, such as spacetime, this simple geometric picture 
can break down. As a simple example, consider a timelike plane defined by 
orthogonal basis vectors {yo, 71}, with y = 1 and 7? = —1. We introduce the 
scalar field 


ġo = (xy0x70) = (2°)? + (xt). (6.10) 
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Figure 6.1 Spacetime gradients. The contours of the scalar field ¢ = 
(xyoxyo) define circles in spacetime. But the direction of the vector deriv- 
ative is only in the outward normal direction along the 0 axis. Along the 
1 axis the gradient points inwards, which reflects the opposite signature. 
Around the circle the gradient interpolates between these two extremes. 
At points where zx is null the gradient vector is tangential to the circle. 


Contours of constant $\phi$ are circles in the spacetime plane, so the direction of
steepest increase points radially outwards. But if we form the gradient of $\phi$ we
obtain

$$ \nabla\phi = 2\dot{\nabla}\langle \dot{x}\gamma_0 x\gamma_0\rangle = 2\gamma_0 x\gamma_0. \tag{6.11} $$

Figure 6.1 shows the direction of this vector for various points on the unit circle.
Clearly the vector does not point in the direction of steepest increase of $\phi$.
Instead, $\nabla\phi$ points in a direction ‘normal’ to tangent vectors in the circle. In
mixed signature spaces, the ‘normal’ does not point in the direction that our Euclidean
intuition expects. This example should be borne in mind when we consider
directed integration in spaces of mixed signature. (This example may appear
esoteric, but closed spacetime curves of this type are of considerable importance
in some modern attempts to construct a quantum theory of gravity.)
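
The behaviour described in this example can be checked numerically. The following is a minimal sketch in plain Python, assuming a point is stored by its coordinates $(x^0, x^1)$ on the $\{\gamma_0, \gamma_1\}$ frame; the helper names (`phi`, `gradient`, `inner`) are illustrative choices and are not taken from the text.

```python
# Signature of the timelike plane: gamma_0^2 = +1, gamma_1^2 = -1.
SIGNATURE = (1.0, -1.0)

def phi(x0, x1):
    """Scalar field of equation (6.10): (x^0)^2 + (x^1)^2."""
    return x0 * x0 + x1 * x1

def gradient(x0, x1, h=1e-6):
    """Components of grad(phi) on the frame {gamma_0, gamma_1}.

    The vector derivative is gamma^mu d_mu, and the reciprocal vectors are
    gamma^0 = gamma_0 and gamma^1 = -gamma_1, so each partial derivative is
    multiplied by the signature of its axis.
    """
    d0 = (phi(x0 + h, x1) - phi(x0 - h, x1)) / (2 * h)
    d1 = (phi(x0, x1 + h) - phi(x0, x1 - h)) / (2 * h)
    return (SIGNATURE[0] * d0, SIGNATURE[1] * d1)

def inner(a, b):
    """Inner product of two vectors in the mixed-signature plane."""
    return SIGNATURE[0] * a[0] * b[0] + SIGNATURE[1] * a[1] * b[1]

on_axis_0 = gradient(1.0, 0.0)      # points outwards along the 0 axis
on_axis_1 = gradient(0.0, 1.0)      # points inwards along the 1 axis
s = 2 ** -0.5
on_null = gradient(s, s)            # x is null at this point of the unit circle
tangent_at_null = (-s, s)           # tangent to the circle, as drawn on the page
```

At the null point the gradient is parallel to the drawn tangent, yet its inner product with that tangent still vanishes, which is the sense in which the gradient remains ‘normal’.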


6.1.2 Vector fields 


Suppose now that we have a vector field $J(x)$. The full vector derivative $\nabla J$
contains two terms, a scalar and a bivector. The scalar term is the divergence of
$J(x)$. In terms of the constant frame vectors $\{e_k\}$ we can write

$$ \nabla\cdot J = \frac{\partial}{\partial x^k}\, e^k\cdot J = \frac{\partial J^k}{\partial x^k} = \partial_k J^k. \tag{6.12} $$

The divergence can also be defined in terms of the geometric product as

$$ \nabla\cdot J = \tfrac{1}{2}(\nabla J + \dot{J}\dot{\nabla}). \tag{6.13} $$

The simplest example of the divergence is for the vector $x$ itself, for which we
find

$$ \nabla\cdot x = \frac{\partial x^k}{\partial x^k} = n, \tag{6.14} $$

where $n$ is the dimension of the space.
The remaining, antisymmetric, term defines the exterior derivative of the vector
field. In terms of coordinates this can be written

$$ \nabla\wedge J = e^i\wedge(\partial_i J) = e^i\wedge e^j\,\partial_i J_j. \tag{6.15} $$

The components are the antisymmetrised terms in $\partial_i J_j$. In three dimensions
these are the components of the curl, though $\nabla\wedge J$ is a bivector, rather than an
(axial) vector. (In this chapter we write vectors in two and three dimensions in
bold face.) The three-dimensional curl requires a duality operation to return a
vector,

$$ \mathrm{curl}(\mathbf{J}) = -I\,\boldsymbol{\nabla}\wedge\mathbf{J}. \tag{6.16} $$

The exterior derivative generalises the curl to arbitrary dimensions.
As an example, consider the exterior derivative of the position vector $x$. We
find that

$$ \nabla\wedge x = e^i\wedge e_i = e^i\wedge e^j\,(e_i\cdot e_j) = 0, \tag{6.17} $$

which follows because $e^i\wedge e^j$ is antisymmetric on $i$ and $j$, whereas $e_i\cdot e_j$ is
symmetric. Again, we can give an algebraic definition of the exterior derivative in
terms of the geometric product as

$$ \nabla\wedge J = \tfrac{1}{2}(\nabla J - \dot{J}\dot{\nabla}). \tag{6.18} $$

Equations (6.13) and (6.18) combine to give the familiar decomposition of a
geometric product:

$$ \nabla J = \nabla\cdot J + \nabla\wedge J. \tag{6.19} $$

So, for example, we have $\nabla x = n$.
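
The divergence and exterior derivative of a three-dimensional vector field can be estimated by finite differences. The sketch below assumes an orthonormal frame (so that $e^i = e_i$) and stores the bivector $\nabla\wedge J$ by its components on $e_i\wedge e_j$ with $i < j$; the function names are hypothetical and chosen only for this illustration.

```python
def partial(field, point, i, h=1e-5):
    """Central-difference estimate of d(field)/dx^i at a point in 3D."""
    up = list(point); up[i] += h
    dn = list(point); dn[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(field(up), field(dn))]

def divergence(field, point):
    """Scalar part of the vector derivative: d_k J^k."""
    return sum(partial(field, point, k)[k] for k in range(3))

def exterior_derivative(field, point):
    """Bivector part: component on e_i ^ e_j (i < j) is d_i J_j - d_j J_i."""
    d = [partial(field, point, i) for i in range(3)]   # d[i][j] = d_i J_j
    return {(i, j): d[i][j] - d[j][i]
            for i in range(3) for j in range(i + 1, 3)}

def curl_from_bivector(b):
    """Dual vector -I B: e2^e3 -> e1, e3^e1 -> e2, e1^e2 -> e3."""
    return [b[(1, 2)], -b[(0, 2)], b[(0, 1)]]

p = (0.3, -1.2, 0.8)

# Position vector field: divergence n = 3 and vanishing exterior derivative.
div_x = divergence(lambda x: list(x), p)
ext_x = exterior_derivative(lambda x: list(x), p)

# Rigid rotation about the 3 axis: divergence 0 and curl 2 e3.
rotation = lambda x: [-x[1], x[0], 0.0]
div_rot = divergence(rotation, p)
curl_rot = curl_from_bivector(exterior_derivative(rotation, p))
```

The rotation field shows the distinction drawn in the text: the exterior derivative is the bivector $2\,e_1\wedge e_2$, and only the duality operation turns it into the axial vector $2\,e_3$.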
6.1.3 Multivector fields


The preceding definitions extend simply to the case of the vector derivative acting
on a multivector field. We have

$$ \nabla A = e^k\partial_k A, \tag{6.20} $$

and for an $r$-grade multivector field $A_r$, we write

$$ \nabla\cdot A_r = \langle\nabla A_r\rangle_{r-1}, \tag{6.21} $$

$$ \nabla\wedge A_r = \langle\nabla A_r\rangle_{r+1}. \tag{6.22} $$

These define the interior and exterior derivatives respectively. The interior derivative
is often referred to as the divergence, and the exterior derivative is sometimes
called the curl. This latter name conflicts with the more familiar meaning
of ‘curl’ in three dimensions, however, and we will avoid this name where possible.

An important result for the vector derivative is that the exterior derivative of
an exterior derivative always vanishes,

$$ \nabla\wedge(\nabla\wedge A) = e^i\wedge\partial_i(e^j\wedge\partial_j A) = e^i\wedge e^j\wedge(\partial_i\partial_j A) = 0. \tag{6.23} $$

This follows because $e^i\wedge e^j$ is antisymmetric on $i$, $j$, whereas $\partial_i\partial_j A$ is symmetric,
due to the fact that partial derivatives commute. Similarly, the divergence of a
divergence vanishes,

$$ \nabla\cdot(\nabla\cdot A) = 0, \tag{6.24} $$

which is proved in the same way, or by using duality. (By convention, the inner
product of a vector and a scalar is zero.)

Because $\nabla$ is a vector, it does not necessarily commute with other multivectors.
We therefore need to be careful in describing the scope of the operator. We use
the following series of conventions to clarify the scope:


(i) In the absence of brackets, $\nabla$ acts on the object to its immediate right.
(ii) When the $\nabla$ is followed by brackets, the derivative acts on all of the terms
in the brackets.
(iii) When the $\nabla$ acts on a multivector to which it is not adjacent, we use
overdots to describe the scope.


The ‘overdot’ notation was introduced in the previous section, and is invaluable
when differentiating products of multivectors. For example, with this notation
we can write

$$ \nabla(AB) = \nabla A\,B + \dot{\nabla}A\dot{B}, \tag{6.25} $$

which encodes a version of the product rule. If necessary, the overdots can be
replaced with partial derivatives by writing

$$ \dot{\nabla}A\dot{B} = e^k A\,\partial_k B. \tag{6.26} $$

Later in this chapter we also employ the overdot notation for linear functions.
Suppose that $f(a)$ is a position-dependent linear function. We write

$$ \dot{\nabla}\dot{f}(a) = \nabla f(a) - e^k f(\partial_k a), \tag{6.27} $$

so that $\dot{\nabla}\dot{f}(a)$ only differentiates the position dependence in the linear function,
and not in its argument.

We can continue to build up a series of useful basic results by differentiating
various multivectors that depend linearly on $x$. For example, consider

$$ \nabla\, x\cdot A_r = e^k\, e_k\cdot A_r, \tag{6.28} $$

where $A_r$ is a grade-$r$ multivector. Using the results of section 4.3.2 we find that

$$ \begin{aligned} \nabla\, x\cdot A_r &= rA_r, \\ \nabla\, x\wedge A_r &= (n-r)A_r, \\ \dot{\nabla}A_r\dot{x} &= (-1)^r(n-2r)A_r, \end{aligned} \tag{6.29} $$

where $n$ is the dimension of the space.
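
As a quick check of equations (6.29), consider the simplest case $r = 1$, taking $A_1 = a$ to be a constant vector (a choice made here purely for illustration). Using $e^k e_k = n$, $e^k\,(e_k\cdot a) = a$ and $a e_k = 2a\cdot e_k - e_k a$, one finds

$$ \begin{aligned} \nabla\, x\cdot a &= e^k\,(e_k\cdot a) = a, \\ \nabla\, x\wedge a &= e^k\,(e_k a - e_k\cdot a) = na - a = (n-1)a, \\ \dot{\nabla}a\dot{x} &= e^k a e_k = e^k\,(2a\cdot e_k - e_k a) = 2a - na = -(n-2)a, \end{aligned} $$

in agreement with (6.29) for $r = 1$. Adding the first two lines recovers $\nabla(xa) = na$, as expected from $\nabla x = n$.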


6.2 Curvilinear coordinates 


So far we have only expressed the vector derivative in terms of a fixed coordinate
frame (which is usually chosen to be orthonormal). In many applications, however,
it is more convenient to work in a curvilinear coordinate system, where the
frame vectors vary from point to point. A general set of coordinates consists of a
set of scalar functions $\{x^i(x)\}$, $i = 1, \ldots, n$, defined over some region. In this region
we can equally write $x(x^i)$, expressing the position vector $x$ parametrically
in terms of the coordinates. If one of the coordinates is varied and all of the
others are held fixed we specify an associated coordinate curve. The derivatives
along these curves specify a set of frame vectors by


$$ e_i(x) = \frac{\partial x}{\partial x^i} = \lim_{\epsilon\to 0}\frac{x(x^1,\ldots,x^i+\epsilon,\ldots,x^n) - x}{\epsilon}, \tag{6.30} $$

where the $i$th coordinate is varied and all others are held fixed. The derivative
in the $e_i$ direction, $e_i\cdot\nabla$, is found by moving a small amount along $e_i$. But this
is precisely the same as varying the $x^i$ coordinate with all others held fixed. We
therefore have

$$ e_i\cdot\nabla = \frac{\partial}{\partial x^i} = \partial_i. \tag{6.31} $$



In order that the coordinate system be valid over a given region we require that
throughout this region

$$ e_1\wedge e_2\wedge\cdots\wedge e_n \neq 0. \tag{6.32} $$

As this quantity can never pass through zero it follows that the frame has the
same orientation throughout the valid region.

We can construct a second frame directly from the coordinate functions by
defining

$$ e^i = \nabla x^i. \tag{6.33} $$

From their construction we see that the $\{e^i\}$ vectors have vanishing exterior
derivative:

$$ \nabla\wedge e^i = \nabla\wedge(\nabla x^i) = 0. \tag{6.34} $$

As the notation suggests, the two frames defined above are reciprocal to one
another. This is straightforward to check:

$$ e_i\cdot e^j = e_i\cdot\nabla x^j = \frac{\partial x^j}{\partial x^i} = \delta_i^j. \tag{6.35} $$


This result is very useful because, when working with curvilinear coordinates,
one usually has simple expressions for either $x^i(x)$ or $x(x^i)$, but rarely both.
Fortunately, only one is needed to construct a set of frame vectors, and the
reciprocal frame can then be constructed algebraically (see section 4.3). This
construction provides a simple geometric picture for the gradient in a general
space. Suppose we view the coordinate $x^1(x)$ as a scalar field. The contours of
constant $x^1$ are a set of $(n-1)$-dimensional surfaces. The remaining coordinates
$x^2, \ldots, x^n$ define a set of directions in this surface. At each point on the surface
of constant $x^1$ the vector $\nabla x^1$ is orthogonal to all of the directions in the surface.
In Euclidean spaces this vector is necessarily orthogonal (normal) to the surface.
In other spaces this construct defines what we mean by normal.

Now suppose we have a function $F(x)$ that is expressed in terms of the coordinates
as $F(x^i)$. A simple application of the chain rule gives

$$ \nabla F = \nabla x^i\,\partial_i F = e^i\partial_i F. \tag{6.36} $$

This is consistent with the decomposition

$$ \nabla = e^i\partial_i = e^i\, e_i\cdot\nabla, \tag{6.37} $$

which holds as the $\{e_i\}$ and $\{e^i\}$ are reciprocal frames.
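
The two constructions of this section can be compared numerically. The sketch below uses plane polar coordinates as an assumed example (it is not one worked in the text up to this point): the frame $\{e_i\}$ is obtained by differentiating $x(x^i)$, the frame $\{e^i\}$ by taking the gradient of $x^i(x)$, and the two are then checked to be reciprocal. All function names are illustrative.

```python
import math

def position(rho, phi):
    """x(x^i): Cartesian components of the point with polar coordinates."""
    return (rho * math.cos(phi), rho * math.sin(phi))

def coords(x, y):
    """x^i(x): polar coordinates of a Cartesian point (away from the origin)."""
    return (math.hypot(x, y), math.atan2(y, x))

def frame_vectors(rho, phi, h=1e-6):
    """e_i = dx/dx^i by central differences, as in equation (6.30)."""
    q = [rho, phi]
    frame = []
    for i in range(2):
        up = list(q); up[i] += h
        dn = list(q); dn[i] -= h
        xu, xd = position(*up), position(*dn)
        frame.append(tuple((a - b) / (2 * h) for a, b in zip(xu, xd)))
    return frame

def reciprocal_vectors(x, y, h=1e-6):
    """e^i = grad x^i by central differences, as in equation (6.33)."""
    recip = []
    for i in range(2):
        gx = (coords(x + h, y)[i] - coords(x - h, y)[i]) / (2 * h)
        gy = (coords(x, y + h)[i] - coords(x, y - h)[i]) / (2 * h)
        recip.append((gx, gy))
    return recip

rho0, phi0 = 2.0, 0.7
e_lower = frame_vectors(rho0, phi0)
e_upper = reciprocal_vectors(*position(rho0, phi0))

# gram[i][j] = e_i . e^j, which should reproduce the Kronecker delta (6.35).
gram = [[sum(a * b for a, b in zip(e_lower[i], e_upper[j])) for j in range(2)]
        for i in range(2)]
```

Neither frame is orthonormal here (for instance $|e_\phi| = \rho$ and $|e^\phi| = 1/\rho$), but the mixed products still give the identity matrix.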


6.2.1 Tensor analysis 


A consequence of curvilinear frame vectors is that one has to be careful when
working entirely in terms of coordinates, as is the case in tensor analysis. The
problem is that for a vector, for example, we have $J = J^i e_i$. If we just keep
the coordinates $J^i$ we lose the information about the position dependence in
the coordinate frame. When formulating the derivative of $J$ in tensor analysis
we must introduce connection coefficients to keep track of the derivatives of the
frame vectors. This can often complicate derivations.

There are two cases of the vector derivative in curvilinear coordinates that do
not require connection coefficients. The first is the exterior derivative, for which
we can write

$$ \nabla\wedge J = \nabla\wedge(J_i e^i) = (\nabla J_i)\wedge e^i. \tag{6.38} $$

It follows that the exterior derivative has coordinates $\partial_i J_j - \partial_j J_i$ regardless of
the chosen coordinate system. The second exception is provided by the divergence
of a vector. We have

$$ \nabla\cdot J = \nabla\cdot(J^i e_i). \tag{6.39} $$

If we define the volume factor $V$ by

$$ e_1\wedge e_2\wedge\cdots\wedge e_n = IV, \tag{6.40} $$

where $I$ is the unit pseudoscalar, we can write (following section 4.3)

$$ e_i = (-1)^{i-1}\, e^n\wedge e^{n-1}\wedge\cdots\wedge\check{e}^i\wedge\cdots\wedge e^1\, IV. \tag{6.41} $$

Recalling that each of the $e^i$ vectors has vanishing exterior derivative, one can
quickly establish that

$$ \nabla\cdot J = \frac{1}{V}\frac{\partial}{\partial x^i}\left(V J^i\right). \tag{6.42} $$

Similarly, the Laplacian $\nabla^2$ can be written as

$$ \nabla^2\phi = \frac{1}{V}\frac{\partial}{\partial x^i}\left(V g^{ij}\frac{\partial\phi}{\partial x^j}\right), \tag{6.43} $$

where $g^{ij} = e^i\cdot e^j$.


6.2.2 Orthogonal coordinates in three dimensions 


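A number of the most useful coordinate systems are orthogonal systems of coordinates
in three dimensions. For these systems a number of special results hold.
We define a set of orthonormal vectors by first introducing the magnitudes

$$ h_i = |e_i| = (e_i\cdot e_i)^{1/2}. \tag{6.44} $$

In terms of these we can write (no sums implied)

$$ \hat{e}_i = \frac{1}{h_i}\, e_i = h_i e^i. \tag{6.45} $$

We now use the $\{\hat{e}_i\}$ as our coordinate frame and, since this frame is orthonormal,
we can work entirely with lowered indices. For a vector $J$ we have

$$ J = \sum_{i=1}^{3} J_i\hat{e}_i = \sum_{i=1}^{3}\frac{J_i}{h_i}\, e_i. \tag{6.46} $$

It follows that we can write

$$ \nabla\cdot J = \frac{1}{h_1 h_2 h_3}\left(\frac{\partial}{\partial x^1}(h_2 h_3 J_1) + \frac{\partial}{\partial x^2}(h_3 h_1 J_2) + \frac{\partial}{\partial x^3}(h_1 h_2 J_3)\right). \tag{6.47} $$

A compact formula for the Laplacian is obtained by replacing each $J_i$ term with
$(1/h_i)\,\partial_i\phi$,

$$ \nabla^2\phi = \frac{1}{h_1 h_2 h_3}\left(\frac{\partial}{\partial x^1}\left(\frac{h_2 h_3}{h_1}\frac{\partial\phi}{\partial x^1}\right) + \frac{\partial}{\partial x^2}\left(\frac{h_3 h_1}{h_2}\frac{\partial\phi}{\partial x^2}\right) + \frac{\partial}{\partial x^3}\left(\frac{h_1 h_2}{h_3}\frac{\partial\phi}{\partial x^3}\right)\right). \tag{6.48} $$

The components of the curl can be found in a similar manner. A number of
useful curvilinear coordinate systems are summarised below.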


Cartesian coordinates 


These are the basic starting point for all other coordinate systems. We introduce
a constant, right-handed orthonormal frame $\{\boldsymbol{\sigma}_i\}$, $\boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = I$. This notation
for a Cartesian frame is borrowed from quantum theory and is very useful in
practice. The coordinates in the $\{\boldsymbol{\sigma}_i\}$ frame are written, following standard
notation, as $(x, y, z)$. To avoid confusion between the scalar coordinate $x$ and
the three-dimensional position vector we write the latter as $\mathbf{r}$. That is,

$$ \mathbf{r} = x\boldsymbol{\sigma}_1 + y\boldsymbol{\sigma}_2 + z\boldsymbol{\sigma}_3. \tag{6.49} $$

Since the frame vectors are orthonormal we have $h_1 = h_2 = h_3 = 1$, so the
divergence and Laplacian take on their simplest forms.


Cylindrical polar coordinates 


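These are denoted $(\rho, \phi, z)$ with $\rho$ and $\phi$ the standard two-dimensional polar
coordinates

$$ \rho = (x^2 + y^2)^{1/2}, \qquad \tan\phi = \frac{y}{x}. \tag{6.50} $$

The coordinates lie in the ranges $0 \le \rho < \infty$ and $0 \le \phi < 2\pi$. The coordinate
vectors are

$$ \begin{aligned} \hat{e}_\rho &= \cos(\phi)\,\boldsymbol{\sigma}_1 + \sin(\phi)\,\boldsymbol{\sigma}_2, \\ \hat{e}_\phi &= -\sin(\phi)\,\boldsymbol{\sigma}_1 + \cos(\phi)\,\boldsymbol{\sigma}_2, \\ \hat{e}_z &= \boldsymbol{\sigma}_3. \end{aligned} \tag{6.51} $$

We have adopted the common convention of labelling the frame vectors with the
associated coordinate. The magnitudes are $h_\rho = 1$, $h_\phi = \rho$ and $h_z = 1$, and the
frame vectors satisfy

$$ \hat{e}_\rho\hat{e}_\phi\hat{e}_z = \boldsymbol{\sigma}_1\boldsymbol{\sigma}_2\boldsymbol{\sigma}_3 = I \tag{6.52} $$

and so form a right-handed set in the order $(\rho, \phi, z)$.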
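
The Laplacian formula (6.48) can be tested numerically with the cylindrical scale factors just quoted. The following is a rough finite-difference sketch, not an efficient implementation; the function `laplacian` and its `scale_factors` argument are names introduced here for illustration only.

```python
import math

def laplacian(f, q, scale_factors, h=1e-3):
    """Equation (6.48) for orthogonal coordinates q = (x^1, x^2, x^3).

    scale_factors(q) returns (h_1, h_2, h_3); every derivative is replaced
    by a central difference with step h.
    """
    def flux(i, point):
        # (h_1 h_2 h_3 / h_i^2) * df/dx^i evaluated at `point`
        hs = scale_factors(point)
        up = list(point); up[i] += h
        dn = list(point); dn[i] -= h
        df = (f(up) - f(dn)) / (2 * h)
        return hs[0] * hs[1] * hs[2] / hs[i] ** 2 * df

    total = 0.0
    for i in range(3):
        up = list(q); up[i] += h
        dn = list(q); dn[i] -= h
        total += (flux(i, up) - flux(i, dn)) / (2 * h)
    hs = scale_factors(q)
    return total / (hs[0] * hs[1] * hs[2])

# Cylindrical polar coordinates (rho, phi, z): h_rho = 1, h_phi = rho, h_z = 1.
cylindrical = lambda q: (1.0, q[0], 1.0)

# Test field x^2 = rho^2 cos^2(phi), whose Cartesian Laplacian is exactly 2.
x_squared = lambda q: (q[0] * math.cos(q[1])) ** 2

value = laplacian(x_squared, (1.5, 0.4, 0.3), cylindrical)
```

Because the test field is simply $x^2$ written in cylindrical coordinates, the result can be compared directly with the Cartesian answer.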


Spherical polar coordinates 


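Spherical polar coordinates arise in many problems in physics, particularly quantum
mechanics and field theory. They are typically labelled $(r, \theta, \phi)$ and are
defined by

$$ r = |\mathbf{r}| = (\mathbf{r}\cdot\mathbf{r})^{1/2}, \qquad r\cos(\theta) = z, \qquad \tan(\phi) = \frac{y}{x}. \tag{6.53} $$

The coordinate ranges are $0 \le r < \infty$, $0 \le \theta \le \pi$ and $0 \le \phi < 2\pi$. The
$\phi$ coordinate is ill defined along the $z$ axis, a reflection of the fact that it is
impossible to construct a global coordinate system over the surface of a sphere.
The inverse relation giving $\mathbf{r}(r, \theta, \phi)$ is often useful,

$$ \mathbf{r} = r\sin(\theta)\,(\cos(\phi)\,\boldsymbol{\sigma}_1 + \sin(\phi)\,\boldsymbol{\sigma}_2) + r\cos(\theta)\,\boldsymbol{\sigma}_3. \tag{6.54} $$

This expression makes it a straightforward exercise to compute the orthonormal
frame vectors, which are

$$ \begin{aligned} \hat{e}_r &= \sin(\theta)\,(\cos(\phi)\,\boldsymbol{\sigma}_1 + \sin(\phi)\,\boldsymbol{\sigma}_2) + \cos(\theta)\,\boldsymbol{\sigma}_3 = r^{-1}\mathbf{r}, \\ \hat{e}_\theta &= \cos(\theta)\,(\cos(\phi)\,\boldsymbol{\sigma}_1 + \sin(\phi)\,\boldsymbol{\sigma}_2) - \sin(\theta)\,\boldsymbol{\sigma}_3, \\ \hat{e}_\phi &= -\sin(\phi)\,\boldsymbol{\sigma}_1 + \cos(\phi)\,\boldsymbol{\sigma}_2. \end{aligned} \tag{6.55} $$

The associated normalisation factors are

$$ h_r = 1, \qquad h_\theta = r, \qquad h_\phi = r\sin(\theta). \tag{6.56} $$

The orthonormal vectors satisfy $\hat{e}_r\hat{e}_\theta\hat{e}_\phi = I$ so that $\{\hat{e}_r, \hat{e}_\theta, \hat{e}_\phi\}$ form a right-handed
orthonormal frame. This frame can be obtained from the $\{\boldsymbol{\sigma}_i\}$ frame
through the application of a position-dependent rotor, so that $\hat{e}_r = R\boldsymbol{\sigma}_3\tilde{R}$,
$\hat{e}_\theta = R\boldsymbol{\sigma}_1\tilde{R}$ and $\hat{e}_\phi = R\boldsymbol{\sigma}_2\tilde{R}$. The rotor is then given by

$$ R = \exp(-I\boldsymbol{\sigma}_3\phi/2)\,\exp(-I\boldsymbol{\sigma}_2\theta/2). \tag{6.57} $$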
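
The normalisation factors and the handedness of the spherical-polar frame can be recovered directly from the inverse relation (6.54). The sketch below differentiates $\mathbf{r}(r, \theta, \phi)$ numerically; vectors are stored by their components on $(\boldsymbol{\sigma}_1, \boldsymbol{\sigma}_2, \boldsymbol{\sigma}_3)$, and the helper names are assumptions made for this illustration.

```python
import math

def r_vec(r, theta, phi):
    """Equation (6.54): components of the position vector on (s1, s2, s3)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def coordinate_frame(q, h=1e-6):
    """Frame vectors e_i = d r / d x^i for q = (r, theta, phi)."""
    frame = []
    for i in range(3):
        up = list(q); up[i] += h
        dn = list(q); dn[i] -= h
        frame.append([(a - b) / (2 * h)
                      for a, b in zip(r_vec(*up), r_vec(*dn))])
    return frame

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def det3(a, b, c):
    """Determinant of the matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

q = (2.0, 0.6, 1.1)
frame = coordinate_frame(q)
h_r, h_theta, h_phi = (norm(e) for e in frame)

# Orthonormal frame; the determinant is +1 when the unit vectors, taken in
# the order (r, theta, phi), form a right-handed set.
unit = [[c / norm(e) for c in e] for e in frame]
handedness = det3(*unit)
```

For an orthonormal triple the condition $\hat{e}_r\hat{e}_\theta\hat{e}_\phi = I$ is equivalent to the determinant of the component matrix being $+1$, which is what `handedness` tests.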


Spheroidal coordinates 


These coordinates turn out to be useful in a number of problems in gravitation
and electromagnetism involving rotating sources. We introduce a vector $\mathbf{a}$, so
that $\pm\mathbf{a}$ denote the foci of a family of ellipses. The distances from the foci are
given by

$$ r_1 = |\mathbf{r} + \mathbf{a}|, \qquad r_2 = |\mathbf{r} - \mathbf{a}|. \tag{6.58} $$

From these we define the orthogonal coordinates

$$ u = \tfrac{1}{2}(r_1 + r_2), \qquad v = \tfrac{1}{2}(r_1 - r_2). \tag{6.59} $$

The coordinate system is completed by rotating the ellipses around the $\mathbf{a}$ axis.
This defines an oblate spheroidal coordinate system. Prolate spheroidal coordinates
are formed by starting in a plane, defining $(u, v)$ as above, and rotating
this system around the minor axis.

If we define

$$ \hat{\mathbf{r}}_1 = \frac{\mathbf{r} + \mathbf{a}}{r_1}, \qquad \hat{\mathbf{r}}_2 = \frac{\mathbf{r} - \mathbf{a}}{r_2}, \tag{6.60} $$

we see that

$$ e^u = \tfrac{1}{2}(\hat{\mathbf{r}}_1 + \hat{\mathbf{r}}_2), \qquad e^v = \tfrac{1}{2}(\hat{\mathbf{r}}_1 - \hat{\mathbf{r}}_2), \tag{6.61} $$

which are clearly orthogonal, since $\hat{\mathbf{r}}_1$ and $\hat{\mathbf{r}}_2$ are both unit vectors. The normalisation factors are found from

$$ h_u^2 = \frac{u^2 - v^2}{u^2 - a^2}, \qquad h_v^2 = \frac{u^2 - v^2}{a^2 - v^2}. \tag{6.62} $$

If we align $\mathbf{a}$ with the 3 axis and let $\phi$ take its spherical-polar meaning, the
coordinate frame is completed with the vector $\hat{e}_\phi$, and

$$ h_\phi^2 = \frac{(u^2 - a^2)(a^2 - v^2)}{a^2}. \tag{6.63} $$

The frame vectors satisfy $\hat{e}_u\hat{e}_\phi\hat{e}_v = I$. The hyperbolic nature of the coordinate
system is often best expressed by redefining the $u$ and $v$ coordinates as $a\cosh(w)$
and $a\cos(\vartheta)$ respectively.


6.3 Analytic functions 


The vector derivative combines the algebraic properties of geometric algebra with 
vector calculus in a simple and natural way. In this section we show how the 
vector derivative can be used to extend the definition of an analytic function 
to arbitrary dimensions. We start by considering the vector derivative in two 
dimensions to establish the link with complex analysis. 


6.3.1 Analytic functions in two dimensions 


Suppose that $\{\mathbf{e}_1, \mathbf{e}_2\}$ define an orthonormal frame in two dimensions. This is
identified with the Argand plane by singling out $\mathbf{e}_1$ as the real axis. We denote
coordinates by $(x, y)$ and write the position vector as $\mathbf{r}$:

$$ \mathbf{r} = x\mathbf{e}_1 + y\mathbf{e}_2. \tag{6.64} $$

With this notation the vector derivative is

$$ \boldsymbol{\nabla} = \mathbf{e}_1\frac{\partial}{\partial x} + \mathbf{e}_2\frac{\partial}{\partial y}. \tag{6.65} $$

In section 2.3.3 we showed that complex numbers sit naturally within the geometric
algebra of the plane. The pseudoscalar is the bivector $I = \mathbf{e}_1\mathbf{e}_2$, which
satisfies $I^2 = -1$. Complex numbers therefore map directly onto even-grade
elements in the algebra by identifying the unit imaginary $i$ with $I$. The position
vector $\mathbf{r}$ is mapped onto a complex number by pre-multiplying by the vector
representing the real axis:

$$ z = x + Iy = \mathbf{e}_1\mathbf{r}. \tag{6.66} $$

Now suppose we introduce the complex field $\psi = u + Iv$. The vector derivative
applied to $\psi$ yields

$$ \boldsymbol{\nabla}\psi = \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right)\mathbf{e}_1 + \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right)\mathbf{e}_2. \tag{6.67} $$

The terms in brackets are precisely the ones that vanish in the Cauchy-Riemann
equations. The statement that $\psi$ is an analytic function (a function that satisfies
the Cauchy-Riemann equations) reduces to the equation

$$ \boldsymbol{\nabla}\psi = 0. \tag{6.68} $$

This is the fundamental equation which can be generalised immediately to higher
dimensions. These generalisations invariably turn out to be of mathematical and
physical importance, and it is no exaggeration to say that equations of the
type of equation (6.68) are amongst the most studied in physics.

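To complete the link with complex analysis we recall that the complex partial
derivative $\partial_z$ is defined by the properties

$$ \frac{\partial z}{\partial z} = 1, \qquad \frac{\partial z^\dagger}{\partial z} = 0, \tag{6.69} $$

with the complex conjugate satisfying

$$ \frac{\partial z}{\partial z^\dagger} = 0, \qquad \frac{\partial z^\dagger}{\partial z^\dagger} = 1. \tag{6.70} $$

From these we see that

$$ \frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} - I\frac{\partial}{\partial y}\right), \qquad \frac{\partial}{\partial z^\dagger} = \frac{1}{2}\left(\frac{\partial}{\partial x} + I\frac{\partial}{\partial y}\right). \tag{6.71} $$

An analytic function is one that depends on $z$ alone. That is, we can write
$\psi(x + Iy) = \psi(z)$. The function is therefore independent of $z^\dagger$, and we have

$$ \frac{\partial\psi(z)}{\partial z^\dagger} = 0. \tag{6.72} $$

This summarises the content of the Cauchy-Riemann equations, though this fact
is often obscured by the complex limiting argument favoured in many textbooks.
Comparing the preceding forms, we see that this equation is equivalent to

$$ \frac{1}{2}\left(\frac{\partial}{\partial x} + I\frac{\partial}{\partial y}\right)\psi = \frac{1}{2}\mathbf{e}_1\boldsymbol{\nabla}\psi = 0, \tag{6.73} $$

recovering our earlier equation.
It is instructive to see why solutions to $\boldsymbol{\nabla}\psi = 0$ can be constructed as power
series in $z$. We first see that

$$ \boldsymbol{\nabla}z = \boldsymbol{\nabla}(\mathbf{e}_1\mathbf{r}) = 2\mathbf{e}_1\cdot\boldsymbol{\nabla}\,\mathbf{r} - \mathbf{e}_1\boldsymbol{\nabla}\mathbf{r} = 2\mathbf{e}_1 - 2\mathbf{e}_1 = 0. \tag{6.74} $$

This little manipulation drives most of analytic function theory! It follows immediately,
for example, that

$$ \boldsymbol{\nabla}(z - z_0)^n = n\boldsymbol{\nabla}(\mathbf{e}_1\mathbf{r} - z_0)\,(z - z_0)^{n-1} = 0, \tag{6.75} $$

so a Taylor series expansion in $z$ about $z_0$ automatically returns an analytic
function. We will delay looking at poles until we have introduced the subject of
directed integration.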
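
The equivalence between analyticity and the vanishing of the vector derivative is easy to test numerically. In the sketch below, even elements $u + Iv$ of the plane are stored as Python complex numbers, with `1j` playing the role of the pseudoscalar $I$; this storage choice and the function name `e1_grad` are assumptions made for the illustration.

```python
def e1_grad(psi, x, y, h=1e-5):
    """Finite-difference estimate of e_1 grad psi = (d/dx + I d/dy) psi.

    psi takes and returns a Python complex number standing for an even
    element u + I v, with I = e_1 e_2 represented by 1j.
    """
    dx = (psi(complex(x + h, y)) - psi(complex(x - h, y))) / (2 * h)
    dy = (psi(complex(x, y + h)) - psi(complex(x, y - h))) / (2 * h)
    return dx + 1j * dy

z0 = 0.3 - 0.2j

# A power of (z - z0) depends on z alone, so the derivative vanishes.
analytic = e1_grad(lambda z: (z - z0) ** 3, 1.2, 0.7)

# The conjugate depends on z-dagger, so the derivative does not vanish.
non_analytic = e1_grad(lambda z: z.conjugate(), 1.2, 0.7)
```

Since $\mathbf{e}_1$ is invertible, $\mathbf{e}_1\boldsymbol{\nabla}\psi = 0$ holds exactly when $\boldsymbol{\nabla}\psi = 0$; for the conjugate the same operator returns 2, in line with equations (6.70) and (6.71).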


6.3.2 Generalized analytic functions 


There are two problems with the standard presentation of complex analytic
function theory that prevent a natural generalisation to higher dimensions:


(i) Both the vector operator $\boldsymbol{\nabla}$ and the functions it operates on are mapped
into the same algebra by picking out a preferred direction for the real
axis. This only works in two dimensions.

(ii) The ‘complex limit’ argument does not generalise to higher dimensions.
Indeed, one can argue that it is not wholly satisfactory in two dimensions,
as it confuses the concept of a directional derivative with the concept of
being independent of $z^\dagger$.


These problems are solved by keeping the derivative operator $\nabla$ as a vector,
while letting it act on general multivectors. The analytic requirement is then
replaced with the equation $\nabla\psi = 0$. Functions satisfying this equation are said
to be monogenic. If $\psi$ contains all grades it is clear that both the even-grade
and odd-grade components must satisfy this equation independently. Without
loss of generality, we can therefore assume that $\psi$ has even grade.

We can construct monogenic functions by following the route which led to the
conclusion that $z$ is analytic in two dimensions. We recall that $\boldsymbol{\nabla}\mathbf{r} = 3$ and

$$ \dot{\boldsymbol{\nabla}}(\mathbf{a}\dot{\mathbf{r}}) = -\mathbf{a}. \tag{6.76} $$

It follows that

$$ \psi = \mathbf{r}\mathbf{a} + 3\mathbf{a}\mathbf{r} \tag{6.77} $$

is a monogenic for any constant vector $\mathbf{a}$. The main difference with complex
analysis is that we cannot derive new monogenics simply from power series in
this solution, due to the lack of commutativity. One can construct monogenic
functions from series of geometric products, but a more instructive route is to
classify monogenics via their angular properties.
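
The claim that $\mathbf{r}\mathbf{a} + 3\mathbf{a}\mathbf{r}$ is monogenic can be checked numerically. The sketch below assumes the standard Pauli-matrix representation of the algebra of three-dimensional space, in which the geometric product of vectors becomes the matrix product; the matrices are stored as nested tuples so that only the Python standard library is needed, and all function names are illustrative.

```python
# Pauli matrices: a faithful matrix model of the frame vectors sigma_k.
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))
SIGMA = (S1, S2, S3)

def mul(a, b):
    """2x2 matrix product, standing in for the geometric product."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(a, b, scale=1.0):
    """Return a + scale * b."""
    return tuple(tuple(a[i][j] + scale * b[i][j] for j in range(2))
                 for i in range(2))

def vector(v):
    """Matrix for v[0] sigma_1 + v[1] sigma_2 + v[2] sigma_3."""
    m = ((0, 0), (0, 0))
    for c, s in zip(v, SIGMA):
        m = add(m, s, c)
    return m

def psi(r, a):
    """The candidate monogenic r a + 3 a r of equation (6.77)."""
    R, A = vector(r), vector(a)
    return add(mul(R, A), mul(A, R), 3.0)

def grad(field, r, h=1e-5):
    """Vector derivative sigma_k d_k field, using central differences."""
    total = ((0, 0), (0, 0))
    for k in range(3):
        up = list(r); up[k] += h
        dn = list(r); dn[k] -= h
        diff = add(field(up), field(dn), -1.0)
        deriv = tuple(tuple(c / (2 * h) for c in row) for row in diff)
        total = add(total, mul(SIGMA[k], deriv))
    return total

a = (0.4, -1.3, 2.0)
point = (0.5, 1.5, -0.7)

residual = grad(lambda r: psi(r, a), point)
size = max(abs(c) for row in residual for c in row)

# For comparison: r a on its own is not monogenic, since grad(r a) = 3 a.
ra_gradient = grad(lambda r: mul(vector(r), vector(a)), point)
```

The two terms of (6.77) contribute $3\mathbf{a}$ and $-3\mathbf{a}$ respectively, so `residual` vanishes to rounding error while `ra_gradient` reproduces $3\mathbf{a}$.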

First we assume that $\Psi$ is a monogenic containing terms which scale uniformly
with $r$. If we introduce polar coordinates we can then write

$$ \Psi(\mathbf{r}) = r^l\psi(\theta, \phi). \tag{6.78} $$

The function $\psi(\theta, \phi)$ then satisfies

$$ l r^{l-1}\hat{e}_r\psi + r^l\boldsymbol{\nabla}\psi(\theta, \phi) = 0. \tag{6.79} $$

It follows that $\psi$ satisfies the angular eigenvalue equation

$$ -\mathbf{r}\wedge\boldsymbol{\nabla}\psi = l\psi. \tag{6.80} $$

These angular eigenstates play a key role in the Pauli and Dirac theories of the
electron. Since $\Psi$ satisfies $\boldsymbol{\nabla}\Psi = 0$, it follows that

$$ \boldsymbol{\nabla}^2\Psi = 0. \tag{6.81} $$

So each component of $\Psi$ (in a constant basis) satisfies Laplace’s equation. It
follows that each component of $\psi$ is a spherical harmonic, and hence that $l$ is an
integer. We can construct a monogenic by starting with the function $(x + yI\boldsymbol{\sigma}_3)^l$,
which is the three-dimensional extension of the complex analytic function $z^l$. In
terms of polar coordinates

$$ (x + yI\boldsymbol{\sigma}_3)^l = r^l\sin^l(\theta)\, e^{l\phi I\boldsymbol{\sigma}_3}, \tag{6.82} $$

which gives us our first angular monogenic function

$$ \psi_l^l = \sin^l(\theta)\, e^{l\phi I\boldsymbol{\sigma}_3}. \tag{6.83} $$

The remaining monogenic functions are constructed from this by acting with an
operator which, in quantum terms, lowers the eigenvalue of the angular momentum
around the $z$ axis. These are discussed in more detail in section 8.4.1.


6.3.3 The spacetime vector derivative 


To construct the vector derivative in spacetime suppose that we introduce the
orthonormal frame $\{\gamma_\mu\}$ with associated coordinates $x^\mu$. We can then write

$$ \nabla = \gamma^\mu\frac{\partial}{\partial x^\mu} = \gamma^0\frac{\partial}{\partial t} + \gamma^i\frac{\partial}{\partial x^i}. \tag{6.84} $$

This derivative is the key operator in all relativistic field theories, including
electromagnetism and Dirac theory. If we post-multiply by $\gamma_0$ we see that

$$ \nabla\gamma_0 = \partial_t + \gamma^i\gamma_0\,\partial_i = \partial_t - \boldsymbol{\nabla}, \tag{6.85} $$

where $\boldsymbol{\nabla} = \boldsymbol{\sigma}_i\partial_i$ is the vector derivative in the relative space defined by the $\gamma_0$
vector. Similarly,

$$ \gamma_0\nabla = \partial_t + \boldsymbol{\nabla}. \tag{6.86} $$

These equations are consistent with

$$ \nabla x = \nabla\gamma_0\,\gamma_0 x = (\partial_t - \boldsymbol{\nabla})(t - \mathbf{r}) = 4, \tag{6.87} $$

where $x$ is the spacetime position vector. The spacetime vector derivative satisfies

$$ \nabla^2 = \frac{\partial^2}{\partial t^2} - \boldsymbol{\nabla}^2, \tag{6.88} $$

which is the fundamental operator describing waves travelling at the speed of
light. The spacetime monogenic equation $\nabla\psi = 0$ is discussed in detail in chapters
7 and 8. We only note here that, if $\psi$ is an even-grade element of the
spacetime algebra, the monogenic equation is precisely the wave equation for a
massless spin-1/2 particle.


6.3.4 Characteristic surfaces and propagation 


The fact that $\nabla^2$ can give rise to either elliptic or hyperbolic operators, depending
on signature, suggests that the propagator theory for $\nabla$ will depend strongly on
the signature. This is confirmed by a simple argument which can be modified
to apply to most first-order differential equations. Suppose we have a generic
equation of the type

$$ \nabla\psi = f(\psi, x), \tag{6.89} $$

where $\psi$ is some multivector field, $f(\psi, x)$ is a known function and $x$ is the
position vector in an $n$-dimensional space. We are presented with data on some
$(n-1)$-dimensional surface, and wish to propagate these initial conditions away
from the surface. If surfaces exist for which this is not possible they are known as
characteristic surfaces. Suppose that we construct a set of independent tangent
vectors in the surface, $\{e_1, \ldots, e_{n-1}\}$. Knowledge of $\psi$ on the surface enables us
to calculate each of the directional derivatives $e_i\cdot\nabla\psi$, $i = 1, \ldots, n-1$. We now
form the normal vector

$$ n = I\, e_1\wedge e_2\wedge\cdots\wedge e_{n-1}, \tag{6.90} $$

where $I$ is the pseudoscalar for the space. Pre-multiplying equation (6.89) with
$n$ we obtain

$$ n\cdot\nabla\psi = -n\wedge\nabla\psi + nf(\psi, x). \tag{6.91} $$
But we have

$$ n\wedge\nabla\psi = I\,(e_1\wedge e_2\wedge\cdots\wedge e_{n-1})\cdot\nabla\psi = I\sum_{i=1}^{n-1}(-1)^{n-i-1}\,(e_1\wedge\cdots\wedge\check{e}_i\wedge\cdots\wedge e_{n-1})\, e_i\cdot\nabla\psi, \tag{6.92} $$

which is constructed entirely from known derivatives of $\psi$. Equation (6.91) then
tells us how to propagate $\psi$ in the $n$ direction. The only situation in which we
can fail to propagate $\psi$ is when $n$ still lies in the surface. This happens if $n$ is
linearly dependent on the surface tangent vectors. If this is the case we have

$$ n\wedge(e_1\wedge e_2\wedge\cdots\wedge e_{n-1}) = 0. \tag{6.93} $$

But this implies that

$$ (I^{-1}n)\wedge n = I^{-1}\, n\cdot n = 0. \tag{6.94} $$

We therefore only fail to propagate when $n^2 = 0$, so characteristic surfaces are always
null surfaces. This possibility can only arise in mixed signature spaces, and
unsurprisingly the propagators in these spaces can have quite different properties
to their Euclidean counterparts.
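
A small worked case may help to fix ideas. We take the timelike plane spanned by $\{\gamma_0, \gamma_1\}$, with $\gamma_0^2 = 1$, $\gamma_1^2 = -1$ and pseudoscalar $I = \gamma_0\gamma_1$ (an example chosen here for illustration), so that a ‘surface’ is a line with a single tangent vector $e_1$. For the null line with tangent $e_1 = \gamma_0 + \gamma_1$ the normal vector of equation (6.90) is

$$ n = Ie_1 = \gamma_0\gamma_1\gamma_0 + \gamma_0\gamma_1\gamma_1 = -\gamma_1 - \gamma_0 = -e_1, \qquad n^2 = \gamma_0^2 + \gamma_1^2 + \gamma_0\gamma_1 + \gamma_1\gamma_0 = 0. $$

The normal lies along the line itself, so equation (6.91) carries no information about derivatives away from the line, and the line is characteristic. For the spacelike line with tangent $e_1 = \gamma_1$, by contrast,

$$ n = \gamma_0\gamma_1\gamma_1 = -\gamma_0, \qquad n^2 = 1, $$

and data on the line can be propagated in the $\gamma_0$ direction.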


6.4 Directed integration theory 


The true power of geometric calculus begins to emerge when we study directed 
integration theory. This provides a very general and powerful integral theorem 
which enables us to construct Green’s functions for the vector derivative in var- 
ious spaces. These in turn can be used to generalise the many powerful results 
from complex function theory to arbitrary spaces. 


6.4.1 Line integrals 


The simplest integrals to start with are line integrals. The line integral of a
multivector field $F(x)$ along a line $x(\lambda)$ is defined by

$$ \int F(x)\frac{dx}{d\lambda}\, d\lambda = \int F\, dx = \lim_{n\to\infty}\sum_{i=1}^{n}\bar{F}^i\Delta x^i. \tag{6.95} $$

In the final expression a set of successive points along the curve $\{x_i\}$ are introduced,
with $x_0$ and $x_n$ the endpoints, and

$$ \Delta x^i = x_i - x_{i-1}, \qquad \bar{F}^i = \tfrac{1}{2}\left(F(x_{i-1}) + F(x_i)\right). \tag{6.96} $$

If the curve is closed then $x_0 = x_n$. The result of the integral is independent
of the way we choose to parameterise the curve, provided the parameterisation
respects the required ordering of points along the curve. Curves that double back
on themselves are handled by referring to the parameterised form $x(\lambda)$, which
tells us how the curve is traversed.

The definition of the integral (6.95) looks so standard that it is easy to overlook
the key new feature, which is that $dx$ is a vector-valued measure, and the product
$F\,dx$ is a geometric product between multivectors. This small extension to scalar
integration is sufficient to bring a wealth of new features. We refer to $dx$, and
its multivector-valued extensions, as a directed measure. The fact that $dx$ is no
longer a scalar means that equation (6.95) is not the most general line integral
we can form. We can also consider integrals of the form

$$ \int F(x)\, dx\, G(x) = \int F(x)\frac{dx}{d\lambda}G(x)\, d\lambda, \tag{6.97} $$

and more generally we can consider sums of terms like these. The most general
form of line integral can be written

$$ \int L(\partial_\lambda x)\, d\lambda = \int L(dx), \tag{6.98} $$

where $L(a) = L(a; x)$ is a multivector-valued linear function of $a$. The position
dependence in $L$ can often be suppressed to streamline the notation.
Suppose now that the field $F$ is replaced by the vector-valued function $v(x)$.
We have

$$ \int v\, dx = \int v\cdot dx + \int v\wedge dx, \tag{6.99} $$

which separates the directed integral into scalar and bivector-valued terms. If
$v$ is the unit tangent vector along the curve then the scalar integral returns the
arc length. In many applications the scalar and bivector integrals are considered
separately. But to take advantage of the most powerful integral theorems in
geometric calculus we need to use the combined form, containing a geometric
product with the directed measure.
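
The discretised definition (6.95)-(6.96) and the split (6.99) can be sketched numerically for a closed curve in the plane. The code below assumes vectors are stored as coordinate pairs in an orthonormal frame, with the product of two vectors returned as a pair (scalar part, coefficient of $I = e_1e_2$); the function names are hypothetical.

```python
import math

def geometric_product_2d(a, b):
    """Product of two plane vectors: (a . b, coefficient of I in a ^ b)."""
    return (a[0] * b[0] + a[1] * b[1], a[0] * b[1] - a[1] * b[0])

def line_integral(field, points):
    """Sum of Fbar^i dx^i, with Fbar^i the mean of the endpoint values (6.96)."""
    scalar, bivector = 0.0, 0.0
    for p, q in zip(points, points[1:]):
        fp, fq = field(p), field(q)
        f_bar = (0.5 * (fp[0] + fq[0]), 0.5 * (fp[1] + fq[1]))
        dx = (q[0] - p[0], q[1] - p[1])
        s, b = geometric_product_2d(f_bar, dx)
        scalar += s
        bivector += b
    return scalar, bivector

n = 2000
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
          for k in range(n + 1)]            # closed curve: x_0 = x_n

# v = x: the scalar part vanishes and the bivector part is twice the
# enclosed area (2 * pi for the unit circle).
position_result = line_integral(lambda x: x, circle)

# v = unit tangent to the circle: the scalar part approximates the arc
# length 2 * pi and the bivector part vanishes.
tangent_result = line_integral(lambda x: (-x[1], x[0]), circle)
```

The two fields give complementary results: one integral is purely a bivector and the other purely a scalar, but both arise from the same geometric product with the directed measure.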


6.4.2 Surface integrals 


The natural extension of a line integral is to a directed surface integral. Suppose
now that the multivector-valued field $F$ is defined over a two-dimensional
surface embedded in some larger space. If the surface is parameterised by two
coordinates $x(x^1, x^2)$ we define the directed measure by the bivector

$$ dX = \frac{\partial x}{\partial x^1}\wedge\frac{\partial x}{\partial x^2}\, dx^1 dx^2 = e_1\wedge e_2\, dx^1 dx^2, \tag{6.100} $$

where $e_i = \partial_i x$. This measure is independent of how the surface is parameterised,
provided we orient the coordinate vectors in the desired order. Sometimes more
than one coordinate patch will be needed to parameterise the entire surface, but
the directed measure $dX$ is still defined everywhere. A directed surface integral
then takes the form

$$ \int F\, dX = \int F\, e_1\wedge e_2\, dx^1 dx^2, \tag{6.101} $$

or a sum of such terms if more than one coordinate patch is required. Again, we
form the geometric product between the integrand and the measure. As in the
case of a line integral, this is not the most general surface integral that can be
considered, as the integrand can multiply the measure from the left or the right,
giving rise to different integrals.

Figure 6.2 A triangulated surface. The surface is represented by a series of
points, and each set of three adjacent points defines a triangle, or simplex.
As more points are added the simplices become a closer fit to the true
surface. Each simplex is given the same orientation by ensuring that for
adjacent simplices, the common edge is traversed in opposite directions.

As an example of a surface integral, consider a closed surface in three dimensions,
with unit outward normal $n$. We let $F$ be given by the bivector-valued
function $\phi nI^{-1}$, where $\phi$ is a scalar field. The surface integral is then

$$ \oint\phi nI^{-1}\, dX = \oint\phi\, |dS|. \tag{6.102} $$

Here $|dS| = I^{-1}n\, dX$ is the scalar-valued measure over the surface. The directed
measure is usually chosen so that $n\, dX$ has the same orientation as $I$. As a second
example, suppose that $F = 1$. In this case we can show that

$$ \oint dX = 0, \tag{6.103} $$

which holds for any closed surface (see later). If the surface is open, the result
of the directed surface integral depends entirely on the boundary, since all the
internal simplices cancel out. This result is sometimes called the vector area,
though in geometric algebra the result is a bivector.

In order to construct proofs of some of the more important results it is necessary
to express the surface integral (6.101) in terms of a limit of a sum. This
involves the idea of a triangulated surface (figure 6.2). A set of points are chosen
on the surface, and adjacent sets of three points define a series of planar triangles,
or simplices. As more points are added these triangles become smaller and
are an ever better model for the surface. (In computer graphics programs this
is precisely how ‘smooth’ surfaces are represented internally.) Each simplex has
an orientation attached such that, for a pair of adjacent simplices, the common
edge is traversed in opposite directions. In this way an initial simplex builds
up to define an orientation for the entire surface. For some surfaces, such as
the Möbius strip, it is not possible to define a consistent orientation over the
entire surface. For these it is not possible to define a directed integral, so our
presentation is restricted to orientable surfaces.

Figure 6.3 A planar simplex. The points $x_0$, $x_1$, $x_2$ define a triangle. The
order specifies how the boundary is traversed, which defines an orientation
for the simplex.

Suppose now that the three points $x_0$, $x_1$, $x_2$ define the corners of a simplex,
with orientation specified by traversing the edges in the order $x_0 \to x_1 \to x_2$
(see figure 6.3). We define the vectors

$$ e_1 = x_1 - x_0, \qquad e_2 = x_2 - x_0. \tag{6.104} $$

The surface measure is then defined by

$$ \Delta X = \tfrac{1}{2}\, e_1\wedge e_2 = \tfrac{1}{2}\left(x_1\wedge x_2 + x_2\wedge x_0 + x_0\wedge x_1\right). \tag{6.105} $$

$\Delta X$ has the orientation defined by the boundary, and an area equal to that of
the simplex. The final expression makes it clear that $\Delta X$ is invariant under even
permutations of the vertices. With this definition of $\Delta X$ we can express the
surface integral (6.101) as the limit:

$$ \int F\, dX = \lim_{n\to\infty}\sum_{k=1}^{n}\bar{F}^k\Delta X^k. \tag{6.106} $$

The sum here runs over all simplices making up the surface, and for each simplex
$\bar{F}$ is the average value of $F$ over the simplex. For well-behaved integrals the value
in the limit is independent of the precise nature of the limiting process.
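
The simplex measure (6.105) and the vanishing of the directed integral over a closed surface, equation (6.103), can both be illustrated with a tetrahedron. In this sketch a bivector in three dimensions is stored by its components on $(e_2\wedge e_3, e_3\wedge e_1, e_1\wedge e_2)$; the vertex coordinates and function names are arbitrary choices made for the example.

```python
def wedge(a, b):
    """Bivector a ^ b in 3D, as components on (e2^e3, e3^e1, e1^e2)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def simplex_measure(x0, x1, x2):
    """Delta X = (1/2) e1 ^ e2 with e1 = x1 - x0 and e2 = x2 - x0."""
    return tuple(0.5 * c for c in wedge(sub(x1, x0), sub(x2, x0)))

def simplex_measure_symmetric(x0, x1, x2):
    """Second form in (6.105): (1/2)(x1^x2 + x2^x0 + x0^x1)."""
    terms = (wedge(x1, x2), wedge(x2, x0), wedge(x0, x1))
    return tuple(0.5 * sum(t[i] for t in terms) for i in range(3))

# A tetrahedron viewed as a closed triangulated surface. Each shared edge is
# traversed in opposite directions by the two faces that meet along it.
p0, p1, p2, p3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.5, 0.5, 3.0)
faces = [(p0, p2, p1), (p0, p1, p3), (p1, p2, p3), (p2, p0, p3)]

# Directed integral of F = 1 over the closed surface.
total = tuple(sum(simplex_measure(*f)[i] for f in faces) for i in range(3))

# The two expressions for Delta X agree on any single face.
first_form = simplex_measure(*faces[2])
second_form = simplex_measure_symmetric(*faces[2])
```

Removing one face from `faces` leaves an open surface whose total measure is minus the measure of the missing face, which depends only on the boundary triangle.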
6.4.3 n-dimensional surfaces


The simplex structure introduced in the previous section provides a means of
defining a directed integral for any dimension of surface. We discretise the surface
by considering a series of points, and adjacent sets of points are combined to
define a simplex. Suppose that we have an $n$-dimensional surface, and that
one simplex for the discretised surface has vertices $x_0, \ldots, x_n$, with the order
specifying the desired orientation. For this simplex we define vectors

$$ e_i = x_i - x_0, \qquad i = 1, \ldots, n, \tag{6.107} $$

and the directed volume element is

$$ \Delta X = \frac{1}{n!}\, e_1\wedge\cdots\wedge e_n. \tag{6.108} $$

A point in the simplex can be described in terms of coordinates $\lambda^1, \ldots, \lambda^n$ by
writing

$$ x = x_0 + \sum_{i=1}^{n}\lambda^i e_i. \tag{6.109} $$

Each coordinate lies in the range $0 \le \lambda^i \le 1$, and the coordinates also satisfy

$$ \sum_{i=1}^{n}\lambda^i \le 1. \tag{6.110} $$

Now suppose we have a multivector field $F(x)$ defined over the surface. We
denote the value at each vertex by $F_i = F(x_i)$. A new function $f(x)$ is then
introduced which linearly interpolates the $F_i$ over the simplex. This can be
written

$$ f(x) = F_0 + \sum_{i=1}^{n}\lambda^i(F_i - F_0). \tag{6.111} $$

As the number of points increases and the simplices grow smaller, $f(x)$ becomes
an ever better approximation to $F(x)$, and the triangulated surface approaches
the true surface.

The directed integral of $F$ over the surface is now approximated by the integral
of $f$ over each simplex in the surface. To evaluate the integral over each simplex
we use the $\lambda^i$ as coordinates, so that

\[ dX = e_1 \wedge \cdots \wedge e_n\, d\lambda^1 \cdots d\lambda^n. \tag{6.112} \]

It is then a straightforward exercise in integration to establish that

\[ \int dX = \Delta X \tag{6.113} \]

and

\[ \int \lambda^i\, dX = \frac{1}{n+1}\, \Delta X, \qquad \forall\, \lambda^i. \tag{6.114} \]

Combining these two results we find that the integral of $f(x)$ over a single simplex
evaluates to

\[ \int f\, dX = \frac{1}{n+1} \left( \sum_{i=0}^{n} F_i \right) \Delta X. \tag{6.115} \]

The function is therefore replaced by its average value over the simplex. We
write this as $\bar{F}$. Summing over all the simplices making up the surface we can
now define

\[ \int F\, dX = \lim_{n\to\infty} \sum_{k=1}^{n} \bar{F}^k \Delta X^k, \tag{6.116} \]


where $k$ runs over all of the simplices in the surface. More generally, suppose
that $L(A_n)$ is a position-dependent linear function of a grade-$n$ multivector $A_n$.
We can then write

\[ \int L(dX) = \lim_{n\to\infty} \sum_{k=1}^{n} \bar{L}^k(\Delta X^k), \tag{6.117} \]

with $\bar{L}^k(\Delta X^k)$ the average value of $L(\Delta X^k)$ over the vertices of each simplex.
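
The vertex-average rule of equation (6.115) is easy to try out numerically. The following is a minimal sketch, assuming a scalar field on a flat Euclidean space and representing the directed volume by its single pseudoscalar coefficient; the helper names `det`, `directed_volume` and `simplex_integral` are illustrative and are not taken from the text.

```python
from math import factorial

def det(m):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def directed_volume(vertices):
    """Pseudoscalar coefficient of (1/n!) e_1 ^ ... ^ e_n, with e_i = x_i - x_0."""
    x0 = vertices[0]
    edges = [[a - b for a, b in zip(v, x0)] for v in vertices[1:]]
    return det(edges) / factorial(len(edges))

def simplex_integral(field, vertices):
    """Vertex average of `field` times the directed volume, as in (6.115)."""
    values = [field(v) for v in vertices]
    return sum(values) / len(values) * directed_volume(vertices)

# Hypothetical test data: a triangle and a field that is linear in position.
triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
linear_field = lambda p: 3.0 * p[0] + 2.0 * p[1] + 1.0
```

For a field that is already linear the rule is exact, and swapping two vertices reverses the sign of the result, reflecting the orientation carried by $\Delta X$.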


6.4.4 The fundamental theorem of geometric calculus 


Most physicists are familiar with a number of integral theorems, including the 
divergence and Stokes’ theorems, and the Cauchy integral formula of complex 
analysis. We will now show that these are all special cases of a more general 
theorem in geometric calculus. In this section we will sketch a proof of this
important theorem. Readers who are not interested in the details of the proof 
may want to jump straight to the following section, where some applications 
are discussed. The proof given here uses simplices and triangulated surfaces, 
which means that it is relevant to methods of discretising integrals for numerical 
computation. 

We start by introducing a notation for simplices which helps clarify the nature
of the boundary operator. We let $(x_0, x_1, \ldots, x_k)$ denote the $k$-simplex defined
by the $k + 1$ points $x_0, \ldots, x_k$. This is abbreviated to

\[ (x)_{(k)} = (x_0, x_1, \ldots, x_k). \tag{6.118} \]


The order of points is important, as it specifies the orientation of the simplex.
If any two adjacent points are swapped then the simplex changes sign. The
boundary operator for a simplex is denoted by $\partial$ and is defined by

\[ \partial (x)_{(k)} = \sum_{i=0}^{k} (-1)^i\, (x_0, \ldots, \check{x}_i, \ldots, x_k)_{(k-1)}, \tag{6.119} \]

where the check denotes that the term is missing from the product. So, for
example,

\[ \partial (x_0, x_1) = (x_1) - (x_0), \tag{6.120} \]

which returns the two points at the end of a line segment. The boundary of a
boundary vanishes,

\[ \partial \partial (x)_{(k)} = 0. \tag{6.121} \]
Proofs of this can be found in most differential geometry textbooks. 
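
The combinatorial content of the boundary operator can be illustrated directly. This is a small sketch, assuming a chain is stored as a dictionary from ordered vertex tuples to integer coefficients; the function name `boundary` and the vertex labels are illustrative choices, not notation from the text.

```python
from collections import defaultdict

def boundary(chain):
    """Apply the boundary operator to a chain {ordered simplex tuple: coefficient}."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # assumption of this sketch: a single point has no boundary
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # omit vertex i
            result[face] += (-1) ** i * coeff      # alternating sign
    return {face: c for face, c in result.items() if c != 0}

triangle = {("x0", "x1", "x2"): 1}
edges = boundary(triangle)
# edges == {("x1", "x2"): 1, ("x0", "x2"): -1, ("x0", "x1"): 1}
```

Applying `boundary` twice returns the empty chain, because every lower-dimensional face is produced twice with opposite signs.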
So far we have dealt only with ordered lists of points, not geometric sums or
products. To add some geometry we introduce the operator $\Delta$ which returns the
directed content of a simplex,

\[ \Delta (x)_{(k)} = \frac{1}{k!}\, (x_1 - x_0) \wedge (x_2 - x_0) \wedge \cdots \wedge (x_k - x_0). \tag{6.122} \]

This is the result of integrating the directed measure over a simplex,

\[ \int_{(x)_{(k)}} dX = \Delta (x)_{(k)} = \Delta X. \tag{6.123} \]

The directed content of a boundary vanishes,

\[ \Delta(\partial (x)_{(k)}) = 0. \tag{6.124} \]

As an example, consider a planar simplex consisting of three points. We have

\[ \partial (x_0, x_1, x_2) = (x_1, x_2) - (x_0, x_2) + (x_0, x_1). \tag{6.125} \]

So the directed content of the boundary is

\[ \Delta(\partial (x_0, x_1, x_2)) = (x_2 - x_1) - (x_2 - x_0) + (x_1 - x_0) = 0. \tag{6.126} \]


The general result of equation (6.124) can be established by induction from the
case of a triangle. These results are sufficient to establish that the directed
integral over the surface of a simplex is zero:

\[ \oint_{\partial (x)_{(k)}} dS = \sum_{i=0}^{k} (-1)^i \int_{(\check{x}_i)_{(k-1)}} dX = \Delta(\partial (x)_{(k)}) = 0. \tag{6.127} \]


A general volume is built up from a chain of simplices. Simplices in the 
chain are defined such that, at any common boundary, the directed areas of 
the bounding faces of two simplices are equal and opposite. It follows that the 
surface integrals over two simplices cancel out over their common face. The
surface integral over the boundary of the volume can therefore be replaced by
the sum of the surface integrals over each simplex in the chain. If the boundary
is closed we establish that

\[ \oint dS = \lim_{n\to\infty} \sum_{a=1}^{n} \oint dS^a = 0. \tag{6.128} \]


The sum runs over each simplex in the surface, with a labeling the simplex. It 
is implicit in this proof that the surface bounds a volume which can be filled by 
a connected set of simplices. So, as well as being oriented, the surface must be 
closed and simply connected. 

Next, we return to equation (6.114) and introduce a constant vector $b$. If we
define $b_i = b \cdot e_i$ we see that

\[ \sum_{i=1}^{k} b_i \lambda^i = b \cdot (x - x_0), \tag{6.129} \]

which is valid for all vectors $x$ in the simplex of interest. Multiplying
equation (6.114) by $b_i$ and summing over $i$ we obtain

\[ \int_{(x)_{(k)}} b \cdot (x - x_0)\, dX = \frac{1}{k+1} \sum_{i=1}^{k} b \cdot e_i\, \Delta X, \tag{6.130} \]

where the integral runs over a simplex defined by $k + 1$ vertices. A simple
reordering yields

\[ \int_{(x)_{(k)}} b \cdot x\, dX = \frac{1}{k+1} \sum_{i=0}^{k} b \cdot x_i\, \Delta X = b \cdot \bar{x}\, \Delta X, \tag{6.131} \]

where $\bar{x}$ is the vector representing the (geometric) centre of the simplex,

\[ \bar{x} = \frac{1}{k+1} \sum_{i=0}^{k} x_i. \tag{6.132} \]

Now suppose we have a $k$-simplex specified by the $k + 1$ points $(x_0, \ldots, x_k)$
and we form the directed surface integral of $b \cdot x$. We obtain

\[ \oint_{\partial (x)_{(k)}} b \cdot x\, dS = \frac{1}{k} \sum_{i=0}^{k} (-1)^i\, b \cdot (x_0 + \cdots \check{x}_i \cdots + x_k)\, \Delta (\check{x}_i)_{(k-1)}. \tag{6.133} \]

To evaluate the final sum we need the result that

\[ \sum_{i=0}^{k} (-1)^i\, b \cdot (x_0 + \cdots \check{x}_i \cdots + x_k)\, \Delta (\check{x}_i)_{(k-1)} = \frac{1}{(k-1)!}\, b \cdot (e_1 \wedge \cdots \wedge e_k). \tag{6.134} \]

The proof of this result is purely algebraic and is left as an exercise. We have
now established the simple result that

\[ \oint_{\partial (x)_{(k)}} b \cdot x\, dS = b \cdot (\Delta X), \tag{6.135} \]

where $\Delta X = \Delta((x)_{(k)})$. The order and orientations in this result are important.
The simplex $(x)_{(k)}$ is oriented, and the order of points specifies how the boundary
is traversed. With $dS$ the oriented element over each boundary, and $\Delta X$ the
volume element for the simplex, we find that the correct expression for the surface
integral is $b \cdot (\Delta X)$.
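
The result (6.135) can be checked by hand for a triangle, and the same check is easy to automate. The sketch below assumes a Euclidean plane, represents vectors as coordinate pairs, and uses the identity $b \cdot (e_1 \wedge e_2) = (b \cdot e_1) e_2 - (b \cdot e_2) e_1$; the function names are illustrative only.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def boundary_integral(b, x0, x1, x2):
    """Directed integral of b.x over the boundary (x1,x2) - (x0,x2) + (x0,x1)."""
    total = [0.0, 0.0]
    for sign, p, q in ((+1, x1, x2), (-1, x0, x2), (+1, x0, x1)):
        mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
        # b.x is linear, so its midpoint value is its average along the edge
        weight = sign * dot(b, mid)
        total[0] += weight * (q[0] - p[0])
        total[1] += weight * (q[1] - p[1])
    return tuple(total)

def contraction_with_volume(b, x0, x1, x2):
    """b . (Delta X) for Delta X = (1/2) e1 ^ e2, with e_i = x_i - x_0."""
    e1 = (x1[0] - x0[0], x1[1] - x0[1])
    e2 = (x2[0] - x0[0], x2[1] - x0[1])
    b1, b2 = dot(b, e1), dot(b, e2)
    return (0.5 * (b1 * e2[0] - b2 * e1[0]), 0.5 * (b1 * e2[1] - b2 * e1[1]))
```

Both functions return the same vector for any non-degenerate triangle, and both change sign together if two vertices are exchanged.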

We are now in a position to apply these results to the interpolated function
$f(x)$ of equation (6.111). Suppose that we are working in a (flat) $n$-dimensional
space and consider a simplex with points $(x_0, \ldots, x_n)$. The simplex is chosen
such that its volume is non-zero, so the $n$ vectors $e_i = x_i - x_0$ define a
(non-orthonormal) frame. We therefore write

\[ e_i = x_i - x_0, \tag{6.136} \]

and introduce the reciprocal frame $\{e^i\}$. These vectors satisfy

\[ e^i \cdot (x - x_0) = \lambda^i. \tag{6.137} \]

It follows that the surface integral of $f(x)$ over the simplex is given by

\[ \oint_{\partial (x)_{(k)}} f(x)\, dS = \sum_{i=1}^{n} (F_i - F_0)\, e^i \cdot (\Delta X). \tag{6.138} \]

But the derivatives of $f(x)$ with respect to the coordinates $\lambda^i$ are simply

\[ \frac{\partial f(x)}{\partial \lambda^i} = F_i - F_0. \tag{6.139} \]


The result of the surface integral can therefore be written

\[ \oint_{\partial (x)_{(k)}} f(x)\, dS = \sum_{i=1}^{n} (F_i - F_0)\, e^i \cdot (\Delta X) = \dot{f}\, \dot{\nabla} \cdot (\Delta X). \tag{6.140} \]

Here we have used the result that $\nabla = e^i \partial_i$, which follows from using the $\lambda^i$ as
a set of coordinates.

We now consider a chain of simplices, and add the result of equation (6.140)
over each simplex in the chain. The interpolated function $f(x)$ takes on the same
value over the common boundary of two adjacent simplices, since $f(x)$ is only
defined by the values at the common vertices. In forming a sum over a chain,
all of the internal faces cancel and only the surface integral over the boundary
remains. We therefore arrive at

\[ \oint f(x)\, dS = \sum_{a} \dot{f}\, \dot{\nabla} \cdot (\Delta X^a), \tag{6.141} \]


with the sum running over all of the simplices in the chain. Taking the limit as
more points are added and each simplex is shrunk in size we arrive at our first
statement of the fundamental theorem,

\[ \oint_{\partial V} F\, dS = \int_V \dot{F} \dot{\nabla}\, dX. \tag{6.142} \]


We have replaced the interpolated function $f$ with $F$, which is obtained in the
limit as more points are added. We have also used the fact that $\nabla$ lies
entirely within the space defined by the pseudoscalar measure $dX$ to remove the
contraction on the right-hand side and write a geometric product.
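
The reciprocal frame that appears in the proof also gives a practical recipe for the gradient of the interpolated function on a single simplex. The following sketch treats a triangle in the Euclidean plane with scalar vertex values; `reciprocal_frame`, `barycentric` and `interpolant_gradient` are hypothetical helper names introduced only for this illustration.

```python
def reciprocal_frame(e1, e2):
    """Vectors e^1, e^2 with e^i . e_j = delta^i_j for a non-degenerate 2D frame."""
    d = e1[0] * e2[1] - e1[1] * e2[0]          # coefficient of e1 ^ e2
    if d == 0:
        raise ValueError("degenerate simplex: e1 ^ e2 vanishes")
    return (e2[1] / d, -e2[0] / d), (-e1[1] / d, e1[0] / d)

def _frame(vertices):
    x0, x1, x2 = vertices
    return (x1[0] - x0[0], x1[1] - x0[1]), (x2[0] - x0[0], x2[1] - x0[1])

def barycentric(x, vertices):
    """Coordinates lambda^i = e^i . (x - x_0)."""
    r1, r2 = reciprocal_frame(*_frame(vertices))
    dx = (x[0] - vertices[0][0], x[1] - vertices[0][1])
    return (r1[0] * dx[0] + r1[1] * dx[1], r2[0] * dx[0] + r2[1] * dx[1])

def interpolant_gradient(values, vertices):
    """Gradient of the linear interpolant: sum_i (F_i - F_0) e^i."""
    r1, r2 = reciprocal_frame(*_frame(vertices))
    d1, d2 = values[1] - values[0], values[2] - values[0]
    return (d1 * r1[0] + d2 * r2[0], d1 * r1[1] + d2 * r2[1])
```

If the vertex values are sampled from a field that is linear in position, `interpolant_gradient` returns its exact gradient, which is the statement that the derivative is constant over the simplex.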

The above proof is easily adapted for the case where the function sits to the
right of the measure, giving

\[ \oint_{\partial V} dS\, G = \int_V \dot{\nabla}\, dX\, \dot{G}. \tag{6.143} \]

Since $\nabla$ is a vector, the commutation properties with $dX$ will depend on the
dimension of the space. A yet more general statement of the fundamental theorem
can be constructed by introducing a linear function $L(A_{n-1}) = L(A_{n-1}; x)$.
This function takes a multivector $A_{n-1}$ of grade $n - 1$ as its linear argument,
and returns a general multivector. $L$ is also position-dependent, and its linear
interpolation over a simplex is defined by

\[ L(A) = L(A; x_0) + \sum_{i=1}^{n} \lambda^i \left( L(A; x_i) - L(A; x_0) \right). \tag{6.144} \]


The linearity of $L(A)$ means that sums and integrals can be moved inside the
argument, and we establish that

\[ \begin{aligned}
\oint L(dS) &= L\Big( \oint dS; x_0 \Big) + \sum_{i=1}^{n} L\Big( \oint \lambda^i dS; x_i \Big) - \sum_{i=1}^{n} L\Big( \oint \lambda^i dS; x_0 \Big) \\
&= \sum_{i=1}^{n} \left( L(e^i \Delta X; x_i) - L(e^i \Delta X; x_0) \right) \\
&= \dot{L}(\dot{\nabla} \Delta X).
\end{aligned} \tag{6.145} \]

There is no position dependence in the final term as the derivative is constant
over the simplex. Building up a chain of simplices and taking the limit we prove
the general result

\[ \oint_{\partial V} L(dS) = \int_V \dot{L}(\dot{\nabla}\, dX). \tag{6.146} \]


This holds for any linear function $L(A_{n-1})$ integrated over a closed region of
an n-dimensional flat space. This is still not the most general statement of the 
fundamental theorem, as we will later prove a version valid for surfaces embedded 
in a curved space, but equation (6.146) is sufficient to make contact with the 
main integral theorems of vector calculus. 


6.4.5 The divergence and Green’s theorems 


To see the fundamental theorem of geometric calculus in practice, first consider
the scalar-valued function

\[ L(A) = \langle J A I^{-1} \rangle. \tag{6.147} \]

Here $J$ is a vector, and $I$ is the (constant) unit pseudoscalar for the $n$-dimensional
space. The argument $A$ is a multivector of grade $n - 1$. Equation (6.146) gives

\[ \int_V \langle \dot{J} \dot{\nabla}\, dX\, I^{-1} \rangle = \int_V \nabla \cdot J\, |dX| = \oint_{\partial V} \langle J\, dS\, I^{-1} \rangle, \tag{6.148} \]

where $|dX| = I^{-1} dX$ is the scalar measure over the volume of interest. The
normal to the surface, $n$, is defined by

\[ n\, |dS| = dS\, I^{-1}, \tag{6.149} \]

where $|dS|$ is the scalar-valued measure over the surface. This definition ensures
that, in Euclidean spaces, $n\, dS$ has the orientation defined by $I$, and in turn that
$n$ points outwards. With this definition we arrive at

\[ \int_V \nabla \cdot J\, |dX| = \oint_{\partial V} n \cdot J\, |dS|, \tag{6.150} \]

which is the familiar divergence theorem. This way of writing the theorem hides
the fact that $n\, |dS|$ should be viewed as a single entity, which can be important
in spaces of mixed signature.

Now return to the fundamental theorem in the form of equation (6.143), and
let $G$ equal the vector $J$ in two-dimensional Euclidean space. We find that

\[ \oint_{\partial V} dS\, J = \int_V \dot{\nabla}\, dX\, \dot{J} = -\int_V \nabla J\, dX, \tag{6.151} \]

where we have used the fact that $dX$ is a pseudoscalar, so it anticommutes with
vectors in two dimensions. Introducing Cartesian coordinates we have
$dX = I\, dx\, dy$, so

\[ \oint_{\partial V} dS\, J = -\int_V \nabla J I\, dx\, dy. \tag{6.152} \]

If we let $J = P e_1 + Q e_2$ and take the scalar part of both sides, we prove Green’s
theorem in the plane:

\[ \oint P\, dx + Q\, dy = \int \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\, dy. \tag{6.153} \]

The line integral is taken around the perimeter of the area in a positive sense,
as specified by $I = e_1 e_2$.
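
Green’s theorem in the plane lends itself to a quick numerical check. The sketch below assumes the unit square traversed anticlockwise and uses a plain midpoint rule; the fields `P` and `Q` are arbitrary examples chosen for this illustration, and `curl` is their hand-computed combination $\partial Q/\partial x - \partial P/\partial y$.

```python
def line_integral(P, Q, corners, steps=200):
    """Midpoint-rule estimate of the closed integral of P dx + Q dy round a polygon."""
    total = 0.0
    for k in range(len(corners)):
        (xa, ya), (xb, yb) = corners[k], corners[(k + 1) % len(corners)]
        dx, dy = (xb - xa) / steps, (yb - ya) / steps
        for s in range(steps):
            t = (s + 0.5) / steps
            x, y = xa + t * (xb - xa), ya + t * (yb - ya)
            total += P(x, y) * dx + Q(x, y) * dy
    return total

def area_integral(integrand, steps=200):
    """Midpoint-rule estimate of the integral of integrand(x, y) over the unit square."""
    h = 1.0 / steps
    return h * h * sum(integrand((i + 0.5) * h, (j + 0.5) * h)
                       for i in range(steps) for j in range(steps))

# Example fields (assumptions of this sketch): J = P e1 + Q e2
P = lambda x, y: x * y
Q = lambda x, y: x * x
curl = lambda x, y: 2.0 * x - x          # dQ/dx - dP/dy

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]   # anticlockwise
```

Reversing the order of the corners flips the sign of the line integral, which is the numerical counterpart of choosing the orientation $I = e_1 e_2$.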


6.4.6 Cauchy’s integral formula 


The fundamental theorem of geometric calculus enables us to view the Cauchy
integral theorem of complex variable theory in a new light. We let $\psi$ denote an
even-grade multivector, which therefore commutes with $dX$, so we can write

\[ \int \nabla \psi\, dX = \oint dS\, \psi = \oint \frac{\partial r}{\partial \lambda}\, \psi\, d\lambda. \tag{6.154} \]

In the final expression $\lambda$ is a parameter along the (closed) curve. Now recall
from section 6.3.1 that we form the complex number $z$ by $z = e_1 r$. We therefore
have

\[ \oint \psi\, dz = e_1 \int \nabla \psi\, dX, \tag{6.155} \]

where the term on the left is now a complex line integral. The condition that $\psi$
is analytic can be written $\nabla \psi = 0$, so we have immediately proved that the line
integral of an analytic function around a closed curve always vanishes.

Cauchy’s integral formula states that, for an analytic function,

\[ f(a) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - a}\, dz, \tag{6.156} \]


where the contour C encloses the point a and is traversed in a positive sense. 
The precise form of the contour is irrelevant, because the difference between two 
contour integrals enclosing a is a contour integral around a region not enclosing 
a (see figure 6.4). In such a region f(z)/(z — a) is analytic so the difference has 
zero contribution. 

To understand Cauchy’s theorem in terms of geometric calculus we need to 
focus on the properties of the Cauchy kernel 1/(z — a). We first write 

\[ \frac{1}{z - a} = \frac{(z - a)^\dagger}{|z - a|^2} = \frac{r - a}{(r - a)^2}\, e_1, \tag{6.157} \]




Figure 6.4 Contour integrals in the complex plane. The two contours C1 
and C2 can be deformed into one another, provided the function to be 
integrated has no singularities in the intervening region. In this case the 
difference vanishes, by Cauchy’s theorem. 


where, in the final expression, $a$ denotes the vector $e_1 a$ corresponding to the
complex number $a$. The essential quantity here is the vector $(r - a)/(r - a)^2$,
which we can write as

\[ \frac{r - a}{(r - a)^2} = \nabla \ln |r - a|. \tag{6.158} \]

But $\ln |r - a|$ is the Green’s function for the Laplacian operator in two dimensions,

\[ \nabla^2 \ln |r - a| = 2\pi \delta(r - a). \tag{6.159} \]

It follows that the vector part of the Cauchy kernel satisfies

\[ \nabla \frac{r - a}{(r - a)^2} = 2\pi \delta(r - a). \tag{6.160} \]
The Cauchy kernel is the Green’s function for the two-dimensional vector deriv- 
ative! The existence of this Green’s function proves that the vector derivative is 
invertible, which is not true of its separate divergence and curl components. 
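The Cauchy integral formula now follows from the fundamental theorem of
geometric calculus in the form of equation (6.155),

\[ \begin{aligned}
\oint \frac{f(z)}{z - a}\, dz &= e_1 \int \nabla \left( \frac{r - a}{(r - a)^2}\, e_1 f(z) \right) dX \\
&= e_1 \int \left( 2\pi \delta(r - a)\, e_1 f(z) + \dot{\nabla}\, \frac{r - a}{(r - a)^2}\, e_1 \dot{f}(z) \right) I\, |dX| \\
&= 2\pi I f(a),
\end{aligned} \tag{6.161} \]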


where we have assumed that $f$ is analytic, $\nabla f(z) = 0$. We can now understand
precisely the roles of each term in the theorem:

(i) The $dz$ encodes the tangent vector and forms a geometric product in the
integrand.
(ii) The $(z - a)^{-1}$ is the Green’s function for the vector derivative $\nabla$ and
ensures that the area integral only picks up the value at $a$.
(iii) The $I$ (which replaces $i$) comes from the directed volume element
$dX = I\, dx\, dy$.
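
The three ingredients listed above can be seen at work in a short numerical experiment. This sketch uses ordinary Python complex numbers, so the imaginary unit `1j` stands in for the pseudoscalar $I$; the contour is assumed to be a circle centred on the origin, and the function name `cauchy_integral` is an illustrative choice.

```python
import cmath

def cauchy_integral(f, a, radius=1.0, samples=400):
    """Estimate (1 / (2 pi i)) * closed integral of f(z) / (z - a) dz on |z| = radius."""
    total = 0j
    dtheta = 2.0 * cmath.pi / samples
    for k in range(samples):
        z = radius * cmath.exp(1j * k * dtheta)
        dz = 1j * z * dtheta                 # tangent element along the circle
        total += f(z) / (z - a) * dz         # Cauchy kernel times f times dz
    return total / (2j * cmath.pi)
```

For an analytic function such as `cmath.exp` the estimate reproduces the value at an interior point `a`, and it returns a value close to zero when `a` lies outside the contour, since the kernel is then analytic throughout the enclosed region.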


Much of this is hidden in conventional accounts, but all of these insights are 
crucial to generalising the theorem. Indeed, we have already proved a more 
general theorem in two dimensions applying to non-analytic functions. For these 
we can now write, following section 6.3.1, 


\[ 2\pi I f(a) = \oint \frac{f(z)}{z - a}\, dz - 2 \int \frac{\partial f}{\partial z^\dagger}\, \frac{1}{z - a}\, I\, |dX|. \tag{6.162} \]


A second key ingredient in complex analysis is the series expansion of a func- 
tion. In particular, if f(z) is analytic apart from a pole of order n at z = a, the 
function has a Laurent series of the form 


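\[ f(z) = \frac{a_{-n}}{(z - a)^n} + \cdots + \frac{a_{-1}}{z - a} + \sum_{i=0}^{\infty} a_i (z - a)^i. \tag{6.163} \]

The powerful residue theorem states that for such a function

\[ \oint_C f(z)\, dz = 2\pi i\, a_{-1}. \tag{6.164} \]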


We now have a new interpretation for the residue term in a Laurent expansion:
it is a weighted Green’s function. The residue theorem just recovers the weight!
Geometric calculus unifies the theory of poles and residues, supposedly unique
to complex analysis, with that of Green’s functions and $\delta$-functions.
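
The statement that the contour integral simply recovers the weight $a_{-1}$ can be checked on a function with a known Laurent expansion. The example below is a sketch with an arbitrarily chosen pole position and coefficients (all assumptions of the illustration), again using Python complex numbers in place of the even subalgebra.

```python
import cmath

def contour_integral(f, radius=1.0, samples=400):
    """Trapezoidal estimate of the integral of f(z) dz round |z| = radius, anticlockwise."""
    total = 0j
    dtheta = 2.0 * cmath.pi / samples
    for k in range(samples):
        z = radius * cmath.exp(1j * k * dtheta)
        total += f(z) * 1j * z * dtheta
    return total

pole = 0.1 - 0.2j                     # hypothetical pole inside the unit circle

def laurent_example(z):
    """Second-order pole with a_{-2} = 2, a_{-1} = 3, plus an analytic part z^2."""
    return 2.0 / (z - pole) ** 2 + 3.0 / (z - pole) + z ** 2
```

Only the simple-pole coefficient survives the integration: the double pole and the analytic part both integrate to zero around the closed contour.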

We now have an alternative picture of complex variable theory in terms of
Green’s functions and surface data. Suppose, for example, that we start with a
function $f(x)$ on the real axis. We seek to propagate this function into the upper
half-plane, subject to the boundary conditions that $f$ falls to zero as $|z| \to \infty$.
The Cauchy formula tells us that we should propagate according to the formula

\[ f(a) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f(x)}{x - a}\, dx. \tag{6.165} \]

But suppose now that we form the Fourier transform of the initial function $f(x)$,

\[ f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \bar{f}(k)\, e^{ikx}. \tag{6.166} \]

We now have

\[ f(a) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \bar{f}(k) \int_{-\infty}^{\infty} \frac{e^{ikx}}{x - a}\, dx. \tag{6.167} \]

Now we only close the $x$ integral in the upper half-plane for positive $k$. For
negative $k$ there is no residue term, since $a$ lies in the upper half-plane. The
Cauchy integral formula now returns

\[ f(a) = \int_{0}^{\infty} \frac{dk}{2\pi}\, \bar{f}(k)\, e^{ika}. \tag{6.168} \]


This shows that only the part of the function consistent with the desired bound- 
ary conditions is propagated in the positive y direction. The remaining part of 
the function propagates in the —y direction, if similar boundary conditions are 
imposed in the lower half plane. In this way the boundary conditions and the 
Green’s function between them specify precisely which parts of a function are 
propagated in the desired direction. No restrictions are placed on the boundary 
values f(x), which need not be part of an analytic function. 

A second example, which generalises nicely, is the unit circle. Suppose we have
initial data $f(\theta)$ defined over the unit circle. We write $f(\theta)$ as

\[ f(\theta) = \sum_{n=-\infty}^{\infty} f_n e^{in\theta}. \tag{6.169} \]

The terms in $\exp(in\theta)$ are replaced by $z^n$ over the unit circle, and we then choose
whether to evaluate the interior or exterior closure of the Cauchy integral. The
result is that only the negative powers are propagated outwards from the circle,
resulting in the function

\[ f(z) = \sum_{n=1}^{\infty} f_{-n} z^{-n}. \tag{6.170} \]


(The constant component $f_0$ is technically propagated as well, but this can be
removed trivially.) These observations are simple from the point of view of
complex variable theory, but are considerably less obvious in propagator theory.
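
The selection of positive or negative powers by the choice of closure can be reproduced numerically. The sketch below assumes boundary data given by a finite set of Fourier coefficients (an arbitrary example), evaluates the Cauchy integral round the unit circle, and flips the sign for the exterior problem because the circle then bounds the region with the opposite orientation. In this particular discretisation the constant term does not appear in the exterior result.

```python
import cmath

def boundary_data(coeffs, theta):
    """f(theta) = sum_n f_n exp(i n theta) for a dict {n: f_n}."""
    return sum(c * cmath.exp(1j * n * theta) for n, c in coeffs.items())

def cauchy_closure(coeffs, a, samples=512):
    """(1 / (2 pi i)) * integral of f / (z - a) dz, anticlockwise round |z| = 1."""
    total = 0j
    dtheta = 2.0 * cmath.pi / samples
    for k in range(samples):
        theta = k * dtheta
        z = cmath.exp(1j * theta)
        total += boundary_data(coeffs, theta) / (z - a) * 1j * z * dtheta
    return total / (2j * cmath.pi)

def propagate_inwards(coeffs, a):
    """Interior closure, valid for |a| < 1: keeps the non-negative powers."""
    return cauchy_closure(coeffs, a)

def propagate_outwards(coeffs, a):
    """Exterior closure, valid for |a| > 1: keeps the negative powers."""
    return -cauchy_closure(coeffs, a)

example_coeffs = {-2: 1.5, -1: 0.5 - 1.0j, 0: 2.0, 1: -0.7, 3: 0.25j}
```

Comparing the two results with the partial sums of the series shows which half of the boundary data each closure picks up.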


6.4.7 Green’s functions in Euclidean spaces 


The extension of complex variable theory to arbitrary Euclidean spaces is now
straightforward. The analogue of an analytic function is a multivector $\psi$
satisfying $\nabla \psi = 0$. We choose to work with even-grade multivectors to simplify
matters. The fundamental theorem states that

\[ \oint_{\partial V} dS\, \psi = \int_V \nabla \psi\, dX = 0, \tag{6.171} \]

where we have used the fact that $\psi$ commutes with the pseudoscalar measure
$dX$. For any monogenic function $\psi$, the directed integral of $\psi$ over a closed
surface must vanish.

The Green’s function for the vector derivative in $n$ dimensions is simply

\[ G(x; y) = \frac{1}{S_n} \frac{x - y}{|x - y|^n}, \tag{6.172} \]

where $x$ and $y$ are vectors and $S_n$ is the surface area of the unit ball in
$n$-dimensional space. The Green’s function satisfies

\[ \nabla G(x; y) = \nabla \cdot G(x; y) = \delta(x - y). \tag{6.173} \]
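
As a consistency check, the general formula can be specialised to the two lowest dimensions. The lines below are a short worked example, assuming the standard values $S_2 = 2\pi$ and $S_3 = 4\pi$ for the surface area of the unit ball and taking all derivatives with respect to $x$.

```latex
% Assumption: S_2 = 2\pi and S_3 = 4\pi; \nabla differentiates with respect to x.
\begin{align*}
n = 2:\quad & G(x; y) = \frac{1}{2\pi} \frac{x - y}{|x - y|^2}
   = \frac{1}{2\pi} \nabla \ln |x - y|, \\
 & \nabla G(x; y) = \frac{1}{2\pi} \nabla^2 \ln |x - y| = \delta(x - y), \\
n = 3:\quad & G(x; y) = \frac{1}{4\pi} \frac{x - y}{|x - y|^3}
   = -\nabla \frac{1}{4\pi |x - y|}, \\
 & \nabla G(x; y) = -\nabla^2 \frac{1}{4\pi |x - y|} = \delta(x - y).
\end{align*}
```

The two-dimensional case is the vector part of the Cauchy kernel met earlier, and the three-dimensional case is the familiar inverse-square field of a unit point source.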


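In order to allow for the lack of commutativity between $G$ and $\psi$ we use the
fundamental theorem in the form

\[ \begin{aligned}
\oint_{\partial V} G\, dS\, \psi &= \int_V \left( \dot{G} \dot{\nabla} \psi + G \dot{\nabla} \dot{\psi} \right) dX \\
&= \int_V \dot{G} \dot{\nabla} \psi\, dX,
\end{aligned} \tag{6.174} \]

where we have used the fact that $\psi$ is a monogenic function. Setting $G$ equal
to the Green’s function of equation (6.172) we find that Cauchy’s theorem in
$n$ dimensions can be written in the form

\[ \psi(y) = \frac{1}{I S_n} \oint_{\partial V} \frac{x - y}{|x - y|^n}\, dS\, \psi(x). \tag{6.175} \]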


This relates the value of a monogenic function at a point to the value of a surface 
integral over a region surrounding the point. 

One consequence of equation (6.175) is that a generalisation of Liouville’s
theorem applies to monogenic functions in Euclidean spaces. We define the
modulus function

\[ |M| = \langle M M^\dagger \rangle^{1/2}, \tag{6.176} \]

which is a well-defined positive-definite function for all multivectors $M$ in a
Euclidean algebra. The modulus function is easily shown to satisfy the Schwarz
inequality in the form

\[ |A + B| \le |A| + |B|. \tag{6.177} \]

If we let $a$ denote a unit vector and let $\nabla_y$ denote the derivative with respect to
the vector $y$ we find that

\[ a \cdot \nabla_y \psi(y) = \frac{1}{I S_n} \oint_{\partial V} \frac{-a (x - y)^2 + n\, a \cdot (x - y)\, (x - y)}{|x - y|^{n+2}}\, dS\, \psi(x). \tag{6.178} \]

It follows that

\[ |a \cdot \nabla_y \psi(y)| \le \frac{1}{S_n} \oint_{\partial V} \frac{n + 1}{|x - y|^n}\, |dS|\, |\psi(x)|. \tag{6.179} \]

But if $\psi$ is bounded, $|\psi(x)|$ never exceeds some given value. Taking the surface
of integration out to large radius $r = |x|$, we find that the right-hand side falls
off as $1/r$. This is sufficient to prove that the directional derivative of $\psi$ must
vanish in all directions, and the only monogenic function that is bounded over
all space is a constant $\psi$.
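
The $1/r$ fall-off can be made explicit with a short estimate. The following is a sketch under stated assumptions: $|\psi(x)| \le M$ everywhere, the surface of integration is a sphere of radius $r > |y|$ centred on the origin, and $a$ is a unit vector. The first line bounds the numerator of the derivative of the kernel $(x - y)/|x - y|^n$ along $a$.

```latex
% Assumptions: |\psi| \le M, sphere of radius r > |y| centred on the origin, |a| = 1.
\begin{align*}
\left| -a (x - y)^2 + n\, a \cdot (x - y)\, (x - y) \right| &\le (n + 1)\, |x - y|^2, \\
|x - y| \ge r - |y|, \qquad \oint |dS| &= S_n\, r^{n-1}, \\
|a \cdot \nabla_y \psi(y)| &\le \frac{1}{S_n}\, \frac{(n + 1) M}{(r - |y|)^n}\, S_n\, r^{n-1}
  = \frac{(n + 1) M}{r} \left( 1 - \frac{|y|}{r} \right)^{-n}
  \xrightarrow[r \to \infty]{} 0 .
\end{align*}
```

Since the left-hand side does not depend on $r$, it must vanish, which is the content of the generalised Liouville theorem.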

Equation (6.175) enables us to propagate a function off an initial surface in
Euclidean space, subject to suitable boundary conditions. Suppose, for example,
that we wish to propagate $\psi$ off the surface of the unit ball, subject to the
condition that the function falls to zero at large distance. Much like the
two-dimensional case, we can write

\[ \psi = \sum_{l=-\infty}^{\infty} \alpha_l \psi_l, \tag{6.180} \]

where the $\psi_l$ are angular monogenics, satisfying

\[ x \wedge \nabla \psi_l = -l \psi_l. \tag{6.181} \]

Each angular monogenic is multiplied by $r^l$ to yield a full monogenic function,
and only the negative powers have their integral closed over the exterior region.
The result is the function

\[ \psi = \sum_{l=1}^{\infty} \alpha_{-l}\, r^{-l}\, \psi_{-l}, \qquad r > 1. \tag{6.182} \]


Similarly, the positive powers are picked up if we solve the interior problem. 


6.4.8 Spacetime propagators 


Propagation in mixed signature spaces is somewhat different to the Euclidean
case. There is no analogue of Liouville’s theorem to call on, so one can easily
construct bounded solutions to the monogenic equation which are non-singular
over all space. Plane wave solutions to the massless Dirac equation are an
example of such functions. Furthermore, the existence of characteristic surfaces
has implications for how boundary values are specified. To see this, consider
a two-dimensional Lorentzian space with basis vectors $\{\gamma_0, \gamma_1\}$, $\gamma_0^2 = -\gamma_1^2 = 1$,
and pseudoscalar $I = \gamma_1 \gamma_0$. The monogenic equation is $\nabla \psi = 0$, where $\psi$ is an
even-grade multivector built from scalar and pseudoscalar terms. We define
the null vectors

\[ n_\pm = \gamma_0 \pm \gamma_1. \tag{6.183} \]

Pre-multiplying the monogenic equation by $n_+$ we find that

\[ n_+ \cdot \nabla \psi = -n_+ \wedge \nabla \psi = I (n_+ I) \cdot \nabla \psi = -I n_+ \cdot \nabla \psi, \tag{6.184} \]

where we have used the result that $I n_\pm = \pm n_\pm$. It follows that

\[ (1 + I)\, n_+ \cdot \nabla \psi = 0, \tag{6.185} \]

and similarly,

\[ (1 - I)\, n_- \cdot \nabla \psi = 0. \tag{6.186} \]


If we take $\psi$ and decompose it into $\psi = \psi_+ + \psi_-$,

\[ \psi_\pm = \tfrac{1}{2} (1 \pm I)\, \psi, \tag{6.187} \]

we see that the values of the separate $\psi_\pm$ components have vanishing derivatives
along the respective null vectors $n_\pm$. Propagation of $\psi$ from an initial surface
is therefore quite straightforward. The function is split into $\psi_\pm$, and the values
of these are transported along the respective null vectors. That is, $\psi_+$ has the
same value along each vector in the $n_+$ direction, and similarly for $\psi_-$ along
$n_-$. There is no need for a complicated contour integral.

The fact that the values of $\psi$ are carried along the characteristics illustrates a
key point. Any surface on which initial values are specified can cut a
characteristic surface only once. Otherwise the initial values are unlikely to be
consistent with the differential equation. For the monogenic equation, $\nabla \psi = 0$,
suitable initial conditions consist of specifying $\psi$ along the $\gamma_1$ axis, for example. But
the fundamental theorem involves integrals around closed loops. The theorem
is still valid in a Lorentzian space, so it is interesting to see what happens to
the boundary data if we attempt to construct an interior solution with arbitrary
surface data. The first step is to construct the Lorentzian Green’s function. This
can be found routinely via its Fourier transformation. With $x = x^0 \gamma_0 + x^1 \gamma_1$ we
find

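\[ \begin{aligned}
G(x) &= i \int \frac{d\omega\, dk}{(2\pi)^2}\, \frac{\omega \gamma_0 + k \gamma_1}{\omega^2 - k^2}\, e^{i(k x^1 - \omega x^0)} \\
&= \frac{i}{2} \int \frac{d\omega\, dk}{(2\pi)^2} \left( \frac{\gamma_0 + \gamma_1}{\omega - k} + \frac{\gamma_0 - \gamma_1}{\omega + k} \right) e^{i(k x^1 - \omega x^0)} \\
&= \tfrac{1}{4}\, \epsilon(x^0) \left( \delta(x^1 - x^0)(\gamma_0 + \gamma_1) + \delta(x^1 + x^0)(\gamma_0 - \gamma_1) \right).
\end{aligned} \tag{6.188} \]

The function $\epsilon(x^0)$ takes the value $+1$ or $-1$, depending on whether $x^0$ is positive
or negative respectively.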

Figure 6.5 A spacetime contour. The contour is closed at spatial infinity.

To apply the fundamental theorem, suppose we take the contour of figure 6.5,
which runs along the $\gamma_1$ axis for two different times $t_i < t_f$ and is closed at
spatial infinity. We assume that the function we are propagating, $\psi$, falls off at
large spatial distance, and write $\psi(x)$ as $\psi(x^0, x^1)$. The fundamental theorem
then gives


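\[ \begin{aligned}
\psi(y) &= I \int_{-\infty}^{\infty} d\lambda\, G(t_i \gamma_0 + \lambda \gamma_1 - y)\, \gamma_1\, \psi(t_i, \lambda) \\
&\quad - I \int_{-\infty}^{\infty} d\lambda\, G(t_f \gamma_0 + \lambda \gamma_1 - y)\, \gamma_1\, \psi(t_f, \lambda) \\
&= \tfrac{1}{4} (1 + I) \left( \psi(t_i, y^1 - y^0 + t_i) + \psi(t_f, y^1 - y^0 + t_f) \right) \\
&\quad + \tfrac{1}{4} (1 - I) \left( \psi(t_i, y^1 + y^0 - t_i) + \psi(t_f, y^1 + y^0 - t_f) \right).
\end{aligned} \tag{6.189} \]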


The construction of $\psi(y)$ in the interior region has a simple interpretation. For
the function $\psi_+(y)$, for example, we form the null vector $n_+$ through $y$. The
value at $y$ is then the average value at the two intersections with the boundary.
A similar construction holds for $\psi_-$. Much like the Euclidean case, only the part
of the function on the boundary that is consistent with the monogenic equation
is propagated to the interior.
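
Transport along the null directions is simple enough to code directly. The sketch below writes $\psi = s + pI$ with $I^2 = +1$, so that $\psi_+$ is fixed by $u = s + p$ (constant for fixed $x - t$) and $\psi_-$ by $v = s - p$ (constant for fixed $x + t$). In these components the monogenic equation reads $\partial_t s + \partial_x p = 0$ and $\partial_t p + \partial_x s = 0$. The component form, the function names and the Gaussian initial data are all assumptions of this illustration.

```python
from math import exp

def propagate(s0, p0, t, x):
    """(s, p) components of psi at (t, x) from data s0, p0 given on the t = 0 line."""
    u = s0(x - t) + p0(x - t)      # psi_+ is carried along n_+
    v = s0(x + t) - p0(x + t)      # psi_- is carried along n_-
    return (u + v) / 2.0, (u - v) / 2.0

def monogenic_residual(s0, p0, t, x, h=1e-4):
    """Central-difference estimate of (d_t s + d_x p, d_t p + d_x s)."""
    f = lambda tt, xx: propagate(s0, p0, tt, xx)
    s_t = (f(t + h, x)[0] - f(t - h, x)[0]) / (2 * h)
    p_t = (f(t + h, x)[1] - f(t - h, x)[1]) / (2 * h)
    s_x = (f(t, x + h)[0] - f(t, x - h)[0]) / (2 * h)
    p_x = (f(t, x + h)[1] - f(t, x - h)[1]) / (2 * h)
    return s_t + p_x, p_t + s_x

def interior_from_slices(psi, t_i, t_f, y0, y1):
    """Average of the boundary values where the null lines through y meet the two slices."""
    u = lambda t, x: psi(t, x)[0] + psi(t, x)[1]
    v = lambda t, x: psi(t, x)[0] - psi(t, x)[1]
    u_avg = 0.5 * (u(t_i, y1 - y0 + t_i) + u(t_f, y1 - y0 + t_f))
    v_avg = 0.5 * (v(t_i, y1 + y0 - t_i) + v(t_f, y1 + y0 - t_f))
    return (u_avg + v_avg) / 2.0, (u_avg - v_avg) / 2.0

# Hypothetical initial data on the t = 0 line
s_init = lambda x: exp(-x * x)
p_init = lambda x: x * exp(-x * x)
```

When the boundary data on both slices come from a genuine monogenic function, `interior_from_slices` reproduces its interior value; for arbitrary data it returns the averaged combination described above.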

These insights hold in other Lorentzian spaces, such as four-dimensional space- 
time. The Green’s functions become more complicated, and typically involve 
derivatives of $\delta$-functions. These are more usefully handled via their Fourier
transforms, and are discussed in more detail in section 8.5. In addition, the lack
of a Liouville’s theorem means that any monogenic function can be added to a
Green’s function to generate a new Green’s function. This has no consequences
if one rigorously applies surface integral formulae. In quantum theory, however,
this is not usually the case. Rather than a rigorous application of the generalised
Green’s theorem, it is common instead to talk about propagators which transfer
initial data from one timeslice to a later one. Used in this role, the Green’s
functions we have derived are referred to as propagators. As we are not specifying
data over a closed surface, adding further terms to our Green’s function can have
an effect. These effects are related to the desired boundary conditions and are
crucial to the formulation of a relativistic quantum field theory. There one is led
to employ the complex-valued Feynman propagator, which ensures that positive
frequency modes are propagated forwards in time, and negative frequency modes 
are propagated backwards in time. We will meet this object in greater detail in 
section 8.5. 


6.5 Embedded surfaces and vector manifolds 


We now seek a generalisation of the preceding results where the volume integral is 
taken over a curved surface. We will do this in the setting of the vector manifold 
theory developed by Hestenes and Sobczyk (1984). The essential concept is to 
treat a manifold as a surface embedded in a larger, flat space. Points in the 
manifold are then treated as vectors, which simplifies a number of derivations. 
Furthermore, we can exploit the coordinate freedom of geometric algebra to 
derive a set of general results without ever needing to specify the dimension 
of the background space. The price we pay for this approach is that we are 
working with a more restrictive concept of a manifold than is usually the case 
in mathematics. For a start, the surface naturally inherits a metric from the 
embedding space, so we are already restricting to Riemannian manifolds. We will 
also insist that a pseudoscalar can be uniquely defined throughout the surface, 
making it orientable. 

While this may all appear quite restrictive, in fact these criteria rule out hardly 
any structures of interest in physics. This approach enables us to quickly prove 
a number of key results in Riemannian geometry, and to unite these with results 
for the exterior geometry of the manifold, achieving a richer general theory. We 
are not prevented from discussing topological features of surfaces either. Rather 
than build up a theory of topology which makes no reference to the metric, 
we instead build up results that are unaffected if the embedding is (smoothly) 
transformed. 

We define a vector manifold as a set of points labelled by vectors lying in a 
geometric algebra of arbitrary dimension and signature. If we consider a path in 
the surface $x(\lambda)$, the tangent vector is defined in the obvious way by

\[ x' = \frac{\partial x(\lambda)}{\partial \lambda} = \lim_{\epsilon \to 0} \frac{x(\lambda + \epsilon) - x(\lambda)}{\epsilon}. \tag{6.190} \]

An advantage of the embedding picture is that the meaning of the limit is well
defined, since the numerator exists for all $\epsilon$. This is true even if, for finite $\epsilon$,
the difference vector does not lie entirely in the tangent space and only becomes a
tangent vector in the limit. Standard formulations of differential geometry avoid 
any mention of an embedding, however, so have to resort to a more abstract 
definition of a tangent vector. 
An immediate consequence of this approach is that we can define the path
length as

\[ s = \int_{\lambda_1}^{\lambda_2} |x' \cdot x'|^{1/2}\, d\lambda. \tag{6.191} \]

The embedded surface therefore inherits a metric from the ‘ambient’ background 
space. All finite-dimensional Riemannian manifolds can be studied in this way 
since, given a manifold, a natural embedding in a larger flat space can always be 
found. In applications such as general relativity one is usually not interested in 
the properties of the embedding, since they are physically unmeasurable. But in 
many other applications, particularly those involving constrained systems, the 
embedding arises naturally and useful information is contained in the extrinsic 
geometry of a manifold. 
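
Because the path is a curve of ambient-space vectors, its length can be estimated with nothing more than chord lengths. The sketch below assumes a Euclidean ambient space and approximates the integral of $|x' \cdot x'|^{1/2}$ by summing $|x(\lambda + \epsilon) - x(\lambda)|$; the example curve, a circle of latitude on the unit sphere, is an arbitrary choice for illustration.

```python
from math import cos, sin, pi, sqrt

def path_length(path, lam1, lam2, steps=20000):
    """Sum of chord lengths |x(lam + eps) - x(lam)| measured in the ambient space."""
    eps = (lam2 - lam1) / steps
    total = 0.0
    prev = path(lam1)
    for k in range(1, steps + 1):
        cur = path(lam1 + k * eps)
        total += sqrt(sum((c - p) ** 2 for c, p in zip(cur, prev)))
        prev = cur
    return total

POLAR_ANGLE = pi / 3.0     # hypothetical latitude for the example curve

def latitude_circle(lam):
    """A closed path lying in the unit sphere embedded in three dimensions."""
    return (sin(POLAR_ANGLE) * cos(lam), sin(POLAR_ANGLE) * sin(lam), cos(POLAR_ANGLE))
```

The chords do not lie in the surface for finite step size, yet their total length converges to the intrinsic length of the curve, here $2\pi \sin(\pi/3)$.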


6.5.1 The pseudoscalar and projection 


Suppose that we next introduce a set of paths in the surface all passing through
the same point $x$. The paths define a set of tangent vectors $\{e_1, \ldots, e_n\}$. We
assume that these are independent, so that they form a basis for the $n$-dimensional
tangent space at the point $x$. The exterior product of the tangent vectors defines
the pseudoscalar for the tangent space $I(x)$:

\[ I(x) = \frac{e_1 \wedge e_2 \wedge \cdots \wedge e_n}{|e_1 \wedge e_2 \wedge \cdots \wedge e_n|}. \tag{6.192} \]

The modulus in the denominator is taken as a positive number, so that $I$ has
the orientation specified by the tangent vectors. The pseudoscalar will satisfy

\[ I^2 = \pm 1, \tag{6.193} \]

with the sign depending on dimension and signature. Clearly, to define $I$ in
this manner requires that the denominator in (6.192) is non-zero. This provides
a restriction on the vector manifolds we consider here, and rules out certain
structures in mixed signature spaces. The unit circle in the Lorentzian plane
(figure 6.1), for example, falls outside the class of surfaces studied here, as
the tangent space has vanishing norm where the tangent vectors become null.
Of course, there is no problem in referring to a closed spacetime curve as a 
vector manifold. The problem arises when attempting to generalise the integral 
theorems of the previous sections to such spaces. 

The pseudoscalar I(x) contains all of the geometric information about the 
surface and unites both its intrinsic and extrinsic properties. As well as assuming 
that I(x) can be defined globally, we will also assume that I(x) is continuous 
and differentiable over the entire surface, that it has the same grade everywhere, 
and that it is single-valued. The final assumption implies that the manifold is 
orientable, and rules out objects such as the Mobius strip, where the pseudoscalar 
is double-valued. Many of the restrictions on the pseudoscalar mentioned above
can be relaxed to construct a more general theory, but this is only achieved at
some cost to the ease of presentation. We will follow the simpler route, as the
results developed here are sufficiently general for our purposes in later chapters.

The pseudoscalar $I(x)$ defines an operator which projects from an arbitrary
multivector onto the component that is intrinsic to the manifold. This operator
is

\[ P(A_r(x), x) = \begin{cases} A_r(x) \cdot I(x)\, I^\dagger(x) = A_r \cdot I\, I^\dagger, & r \le n, \\ 0, & r > n, \end{cases} \tag{6.194} \]

which defines an operator at every point x on the manifold. It is straightforward 
to prove that P satisfies the essential requirement of a projection operator, that 
is, 

P?(A) = P(P(A)) = P(A). (6.195) 


The effect of P on a vector a is to project onto the component of a that lies 
entirely in the tangent space at the point x. Such vectors are said to be intrinsic 
to the manifold. The complement, 


\[ P_\perp(a) = a - P(a), \tag{6.196} \]


lies entirely outside the tangent space, and is said to be extrinsic to the manifold. 
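
For vectors, the projection can be computed without forming the pseudoscalar explicitly. The sketch below assumes a two-dimensional tangent space inside a Euclidean three-dimensional ambient space and uses the equivalent reciprocal-frame form $P(a) = \sum_i (a \cdot e_i)\, e^i$, with the reciprocal vectors obtained from the inverse Gram matrix; the function names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(a, e1, e2):
    """Intrinsic part of a: sum_i (a . e_i) e^i for the tangent frame {e1, e2}."""
    g11, g12, g22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    d = g11 * g22 - g12 * g12                 # squared modulus of e1 ^ e2
    if d == 0:
        raise ValueError("tangent vectors are not independent")
    # reciprocal frame e^1, e^2 from the inverse Gram matrix
    r1 = tuple((g22 * p - g12 * q) / d for p, q in zip(e1, e2))
    r2 = tuple((-g12 * p + g11 * q) / d for p, q in zip(e1, e2))
    a1, a2 = dot(a, e1), dot(a, e2)
    return tuple(a1 * u + a2 * v for u, v in zip(r1, r2))

def reject(a, e1, e2):
    """Extrinsic part of a: a - P(a)."""
    return tuple(x - y for x, y in zip(a, project(a, e1, e2)))
```

Applying `project` twice gives the same vector as applying it once, and the output of `reject` is orthogonal to both tangent vectors.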
Suppose now that A(x) is a multivector field defined over some region of the 
manifold. We do not assume that A is intrinsic to the manifold. Given a vector 
a in the tangent space, the directional derivative along a is defined in the obvious 

manner: 
a-V A(x) = lim i a A), 


er0 € 


(6.197) 


Again, the presence of the embedding enables us to write this limit without 
ambiguity. The derivative operator $a\cdot\nabla$ is therefore simply the vector derivative
in the ambient space contracted with a vector in the tangent space. Given a set of
linearly independent tangent vectors $\{e_i\}$, we can now define a vector derivative
$\partial$ intrinsic to the manifold by

\[
\partial = e^i\, e_i\cdot\nabla = \mathsf{P}(\nabla).
\tag{6.198}
\]


This is simply the ambient space vector derivative projected onto the tangent
space. The use of the $\partial$ symbol should not cause confusion with the boundary
operator introduced in section 6.4.4. The definition of $\partial$ requires the existence
of the reciprocal frame $\{e^i\}$, which is why we restricted to manifolds over which
$I$ is globally defined. The projection of the vector operator $\partial$ satisfies

\[ \mathsf{P}(\partial) = \partial. \tag{6.199} \]

The contraction of $\partial$ with a tangent vector $a$ satisfies $a\cdot\partial = a\cdot\nabla$, which is simply
the directional derivative in the $a$ direction.

6.5.2 Directed integration for embedded surfaces


Now that we have defined the $\partial$ operator it is a straightforward task to write
down a generalized version of the fundamental theorem of calculus appropriate
for embedded surfaces. We can essentially follow through the derivation
of section 6.4.4 with little modification. The volume to be integrated over is
again triangulated into a chain of simplices. The only difference now is that the
pseudoscalar varies from one simplex to another. This changes
very little. For example we still have

\[ \oint dS = 0, \tag{6.200} \]


which holds for the directed integral over the closed boundary of any simply- 
connected vector manifold. 

The linear interpolation results used in deriving equation (6.138) are all valid,
because we can again fall back on the embedding picture. In addition, the
assumption that the pseudoscalar I(x) is globally defined means that the reciprocal
frame required in equation (6.138) is well defined. The only change that
has to be made is that the ambient derivative $\nabla$ is replaced by its projection
into the manifold, because we naturally assemble the inner product of $\nabla$ with
the pseudoscalar. The most general statement of the fundamental theorem can
now be written as

\[
\oint_{\partial V} \mathsf{L}(dS) = \int_V \dot{\mathsf{L}}(\dot{\partial}\, dX)
 = \int_V \dot{\mathsf{L}}(\dot{\nabla}\cdot dX).
\tag{6.201}
\]


The form of the volume integral involving $\partial$ is generally more useful as it forms
a geometric product with the volume element. The function L can be any
multivector-valued function in this equation; it is not restricted to lie in the
tangent space. An important feature of this more general theorem is that if we
write $dX = I|dX|$ we see that the directed element $dX$ is position-dependent.
But this position dependence is not differentiated in equation (6.201). It is only
the integrand that is differentiated.

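There are two main applications of the general theorem derived here. The first
is a generalisation of the divergence theorem to curved spaces. We again write

\[ \mathsf{L}(A) = \langle J A I^{-1} \rangle, \tag{6.202} \]

where $J$ is a vector field in the tangent space, and $I$ is the unit pseudoscalar for
the n-dimensional curved space. Equation (6.201) now gives

\[
\oint_{\partial V} n\cdot J\, |dS|
 = \int_V \left( \partial\cdot J + \langle J \dot{\partial}\, \dot{I}^{-1} I \rangle \right) |dX|,
\tag{6.203}
\]

where $|dX| = I^{-1}dX$ and $n|dS| = dS\, I^{-1}$. The final term in the integral vanishes,
as can be shown by first writing $I^{-1} = \pm I$ and using

\[
\langle J \dot{\partial}\, \dot{I} I \rangle
 = \tfrac{1}{2} \langle J \dot{\partial} (\dot{I} I + I \dot{I}) \rangle
 = \tfrac{1}{2} \langle J \partial (I^2) \rangle = 0.
\tag{6.204}
\]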


It follows that the divergence theorem in curved space is essentially unchanged
from the flat-space version, so

\[
\int_V \partial\cdot J\, |dX| = \oint_{\partial V} n\cdot J\, |dS|.
\tag{6.205}
\]


As a second application we derive Stokes’ theorem in three dimensions. Suppose
that $\sigma$ denotes an open, connected surface in three dimensions, with boundary
$\partial\sigma$. The linear function L takes a vector as its linear argument and we define

\[ \mathsf{L}(a) = J\cdot a. \tag{6.206} \]

Equation (6.201) now gives

\[
\oint_{\partial\sigma} J\cdot dl = \int_\sigma \dot{J}\cdot(\dot{\nabla}\cdot dX)
 = -\int_\sigma (\nabla\wedge J)\cdot dX,
\tag{6.207}
\]

where the line integral is taken around the boundary of the surface, and since the
embedding is specified we have chosen a form of the integral theorem involving
the three-dimensional derivative $\nabla$. We now define the normal vector to the
surface by

\[ dX = I n\, |dX|, \tag{6.208} \]


where $I$ is the three-dimensional (right-handed) pseudoscalar. This equation
defines the vector $n$ normal to the surface. The direction in which this points
depends on the orientation of $dX$. Around the boundary, for example, we can
denote the tangent vector at the boundary by $l$, and the vector pointing into
the surface as $m$. Then $dX$ has the orientation specified by $l\wedge m$, and from
equation (6.208) we see that $l, m, n$ must form a right-handed set. This extends
inwards to define the normal vector $n$ over the surface (see figure 6.6). We now
have

\[
\oint_{\partial\sigma} J\cdot dl = \int_\sigma (-I\,\nabla\wedge J)\cdot n\, |dX|
 = \int_\sigma (\mathrm{curl}\, J)\cdot n\, |dX|,
\tag{6.209}
\]

which is the familiar Stokes’ theorem in three dimensions. This is only the
scalar part of a more general (and less familiar) theorem which holds in three
dimensions. To form this result we remove the projection onto the scalar part,
to obtain

\[
\oint_{\partial\sigma} dl\, J = -I \int_\sigma n\wedge\nabla J\, |dX|.
\tag{6.210}
\]

A version of this result holds for any open n-dimensional surface embedded in a
flat space of dimension n + 1.
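
The scalar Stokes’ theorem of equation (6.209) can be checked numerically for a simple case. The sketch below takes the upper unit hemisphere as the surface and the unit circle in the z = 0 plane as its boundary; the vector field `J`, its hand-computed curl and the midpoint-rule integration are illustrative assumptions, and the orientation follows the right-handed convention described above (anticlockwise boundary, outward normal).

```python
import math

def J(x, y, z):
    """Illustrative vector field (an assumption, not taken from the text)."""
    return (z - y, x, x * y)

def curl_J(x, y, z):
    """Curl of the field above, computed by hand."""
    return (x, 1.0 - y, 2.0)

def boundary_integral(steps=2000):
    """Line integral of J around the unit circle z = 0, traversed anticlockwise."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        jx, jy, _ = J(x, y, 0.0)
        total += (jx * -y + jy * x) * dt      # dl = (-sin t, cos t, 0) dt
    return total

def surface_integral(steps=400):
    """Flux of curl J through the upper unit hemisphere with outward normal n."""
    total = 0.0
    dth = 0.5 * math.pi / steps
    dph = 2.0 * math.pi / steps
    for i in range(steps):
        th = (i + 0.5) * dth
        for j in range(steps):
            ph = (j + 0.5) * dph
            # On the unit sphere the position vector is also the outward normal.
            n = (math.sin(th) * math.cos(ph),
                 math.sin(th) * math.sin(ph),
                 math.cos(th))
            c = curl_J(*n)
            total += sum(ci * ni for ci, ni in zip(c, n)) * math.sin(th) * dth * dph
    return total
```

With these choices the two integrals should agree to within the discretisation error of the midpoint rule.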


6.5.3 Intrinsic and extrinsic geometry 


[Figure 6.6 Orientations for Stokes’ theorem. The bivector measure $dX$
defines an orientation over the surface and at the boundary. With $l$ and $m$
the tangent and inward directions at the boundary, the normal $n$ is defined
so that $l, m, n$ form a right-handed set.]

Suppose now that the directional derivative $a\cdot\partial$ acts on a tangent vector field
$b(x) = \mathsf{P}(b(x))$. There is no guarantee that the resulting vector also lies entirely
in the tangent space, even if $a$ does. For example, consider the simple case of a
circle in the plane. The derivative of the tangent vector around the circle is a
radial vector, which is entirely extrinsic to the manifold. In order to restrict to
quantities intrinsic to the manifold we define a new derivative, the covariant
derivative $D$, as follows:

\[ a\cdot D A(x) = \mathsf{P}(a\cdot\partial A(x)). \tag{6.211} \]
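
The circle example can be made concrete with a few lines of code. The sketch below, which assumes the unit circle in the Euclidean plane and uses a central-difference estimate (both choices are illustrative), differentiates the unit tangent field along the circle and then projects the result back onto the tangent line, in the spirit of equation (6.211).

```python
import math

def tangent(theta):
    """Unit tangent vector to the unit circle at angle theta."""
    return (-math.sin(theta), math.cos(theta))

def project(v, theta):
    """Projection of a plane vector v onto the tangent line at angle theta."""
    t = tangent(theta)
    vt = v[0] * t[0] + v[1] * t[1]
    return (vt * t[0], vt * t[1])

def directional_derivative(theta, h=1e-5):
    """Central-difference derivative of the tangent field with respect to
    arc length (which equals theta on the unit circle)."""
    tp, tm = tangent(theta + h), tangent(theta - h)
    return ((tp[0] - tm[0]) / (2 * h), (tp[1] - tm[1]) / (2 * h))

theta = 0.7
dt = directional_derivative(theta)   # close to minus the radial unit vector
cov = project(dt, theta)             # covariant derivative, close to zero
```

The directional derivative `dt` is purely radial, so its projection `cov` vanishes to within numerical error: the tangent field is covariantly constant even though its ambient derivative is not zero.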
The operator $a\cdot D$ acts on multivectors in the tangent space, returning a new
multivector field in the tangent space. Since the $a\cdot\partial$ operator satisfies Leibniz’s
rule, the covariant derivative $a\cdot D$ must as well,

\[
a\cdot D(AB) = \mathsf{P}(a\cdot\partial(AB)) = (a\cdot D A)B + A\, a\cdot D B.
\tag{6.212}
\]

The vector operator $D$ is then defined in the obvious way from the covariant
directional derivatives,

\[ D = e^i\, e_i\cdot D. \tag{6.213} \]

So, for example, we can write

\[ D A_r = e^i (e_i\cdot D A_r) = \mathsf{P}(\partial A_r). \tag{6.214} \]

The result decomposes into grade-raising and grade-lowering terms, so we write

\[
\begin{aligned}
D\cdot A_r &= \langle D A_r \rangle_{r-1}, \\
D\wedge A_r &= \langle D A_r \rangle_{r+1}.
\end{aligned}
\tag{6.215}
\]

So, like $\partial$, $D$ has the algebraic properties of a vector in the tangent space. Acting
on a scalar function $\alpha(x)$ defined over the manifold the two derivatives coincide,
so

\[ \partial\alpha(x) = D\alpha(x). \tag{6.216} \]


Suppose now that $a$ is a tangent vector to the manifold, and we look at how
the pseudoscalar changes along the $a$ direction. It should be obvious, from
considering a 2-sphere for example, that the resulting quantity must lie at least
partly outside the manifold. We let $\{e_i\}$ denote an orthonormal frame, so

\[ I = e_1 e_2 \cdots e_n. \tag{6.217} \]

It follows that

\[
\begin{aligned}
a\cdot\partial I\, I^{-1}
 &= \sum_{i=1}^{n} e_1 \cdots \big( a\cdot D e_i + \mathsf{P}_\perp(a\cdot\partial e_i) \big) \cdots e_n\, I^{-1} \\
 &= a\cdot D I\, I^{-1} + \mathsf{P}_\perp(a\cdot\partial e_i)\wedge e^i.
\end{aligned}
\tag{6.218}
\]


The final term is easily shown to be independent of the choice of frame. But
$a\cdot D I$ must remain in the tangent space, so it can only be a multiple of the
pseudoscalar $I$. It follows that

\[
(a\cdot D I) I = \langle (a\cdot D I) I \rangle = \tfrac{1}{2} \langle a\cdot D(I^2) \rangle = 0,
\tag{6.219}
\]

so

\[ a\cdot D I = 0. \tag{6.220} \]

That is, the (unit) pseudoscalar is a covariant constant over the manifold. Equation
(6.218) now simplifies to give

\[
a\cdot\partial I = \mathsf{P}_\perp(a\cdot\partial e_i)\wedge e^i\, I \equiv -\mathsf{S}(a) I,
\tag{6.221}
\]

which defines the shape tensor $\mathsf{S}(a)$. This is a bivector-valued, linear function
of its vector argument $a$, where $a$ is a tangent vector. Since the result of $a\cdot\partial I$
has the same grade as $I$, we can write

\[ a\cdot\partial I = I \times \mathsf{S}(a) \tag{6.222} \]

with

\[ \mathsf{S}(a)\cdot I = \mathsf{S}(a)\wedge I = 0. \tag{6.223} \]

The fact that $\mathsf{S}(a)\cdot I = 0$ confirms that $\mathsf{S}(a)$ lies partly outside the manifold, so
that $\mathsf{P}(\mathsf{S}(a)) = 0$.
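The shape tensor $\mathsf{S}(a)$ unites the intrinsic and extrinsic geometry of the manifold
in a single quantity. It can be thought of as the ‘angular momentum’ of
$I(x)$ as it slides over the manifold. The shape tensor provides a compact relation
between directional and covariant derivatives. We first form

\[
b\cdot\mathsf{S}(a) = b^i\, \mathsf{P}_\perp(a\cdot\partial e_i) = \mathsf{P}_\perp(a\cdot\partial b),
\tag{6.224}
\]

where $a$ and $b$ are tangent vectors. It follows that

\[
a\cdot\partial b = \mathsf{P}(a\cdot\partial b) + \mathsf{P}_\perp(a\cdot\partial b) = a\cdot D b + b\cdot\mathsf{S}(a),
\tag{6.225}
\]

which we can rearrange to give the neat result

\[ a\cdot D b = a\cdot\partial b + \mathsf{S}(a)\cdot b. \tag{6.226} \]

Applying this result to the geometric product $bc$ we find that

\[
\begin{aligned}
a\cdot D(bc) &= (a\cdot\partial b)c + \mathsf{S}(a)\cdot b\, c + b(a\cdot\partial c) + b\, \mathsf{S}(a)\cdot c \\
 &= a\cdot\partial(bc) + \mathsf{S}(a)\times(bc),
\end{aligned}
\tag{6.227}
\]

where $\times$ is the commutator product, $A\times B = (AB - BA)/2$. It follows that for
any multivector field $A$ taking its values in the tangent space we have

\[ a\cdot D A = a\cdot\partial A + \mathsf{S}(a)\times A. \tag{6.228} \]

The fact that $\mathsf{S}(a)$ is bivector-valued ensures that $\mathsf{S}(a)\times A$ does not alter the
grade of $A$. As a check, setting $A = I$ recovers equation (6.222). If we now write

\[
a\cdot\partial b = a\cdot\partial\, \mathsf{P}(b) = a\cdot\dot{\partial}\, \dot{\mathsf{P}}(b) + \mathsf{P}(a\cdot\partial b)
 = a\cdot\dot{\partial}\, \dot{\mathsf{P}}(b) + a\cdot D b
\tag{6.229}
\]

we establish the further relation

\[ a\cdot\dot{\partial}\, \dot{\mathsf{P}}(b) = b\cdot\mathsf{S}(a). \tag{6.230} \]

This holds for any pair of tangent vectors $a$ and $b$.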
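
The split of equation (6.225) into intrinsic and extrinsic parts can be explored numerically. The sketch below assumes the unit 2-sphere in Euclidean three-space and a tangent field obtained by projecting a fixed ambient vector `C`; both are illustrative choices. For this particular surface the extrinsic part is expected to equal minus (a·b) times the unit normal, a standard result that is used here only as a check and is not derived in the text.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def unit(x):
    r = math.sqrt(dot(x, x))
    return [xi / r for xi in x]

def project(v, x):
    """Tangent-plane projection at the point x (normal taken along x)."""
    n = unit(x)
    vn = dot(v, n)
    return [vi - vn * ni for vi, ni in zip(v, n)]

C = [0.3, -1.2, 0.5]                 # fixed ambient vector (illustrative)

def b_field(x):
    """Tangent vector field b(x) = P(C), projecting C at each point."""
    return project(C, x)

def directional_derivative(field, x, a, h=1e-5):
    """Central-difference estimate of the ambient derivative of a field along a."""
    xp = [xi + h * ai for xi, ai in zip(x, a)]
    xm = [xi - h * ai for xi, ai in zip(x, a)]
    fp, fm = field(xp), field(xm)
    return [(p - m) / (2 * h) for p, m in zip(fp, fm)]

x = unit([1.0, 2.0, 2.0])            # point on the unit sphere
a = project([0.4, 0.1, -0.7], x)     # tangent vector at x
b = b_field(x)

d = directional_derivative(b_field, x, a)               # ambient derivative of b along a
intrinsic = project(d, x)                               # covariant part, a.D b
extrinsic = [di - ii for di, ii in zip(d, intrinsic)]   # normal part, b.S(a)
```

The list `extrinsic` points along the normal `x`, while `intrinsic` stays in the tangent plane, so the two pieces correspond to the two terms on the right-hand side of equation (6.225).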


6.5.4 Coordinates and derivatives 


A number of important results can be derived most simply by introducing a
coordinate frame. In a region of the manifold we introduce local coordinates $x^i$
and define the frame vectors

\[ e_i = \frac{\partial x}{\partial x^i}. \tag{6.231} \]

From the definition of $\partial$ it follows that $e^i = \partial x^i$. The $\{e_i\}$ are usually referred
to as tangent vectors and the reciprocal frame $\{e^i\}$ as cotangent vectors (or 1-forms).
The fact that the space is curved implies that it may not be possible to
construct a global coordinate system. The 2-sphere is the simplest example of
this. In this case we simply patch together a series of local coordinate systems.
The covariant derivative along a coordinate vector, $e_i\cdot D$, satisfies

\[
e_i\cdot D A = D_i A = \partial_i A + \mathsf{S}_i\times A,
\tag{6.232}
\]

which defines the $D_i$ and $\mathsf{S}_i$ symbols.
The tangent frame vectors satisfy

\[
\partial_i e_j - \partial_j e_i = (\partial_i\partial_j - \partial_j\partial_i)\, x = 0.
\tag{6.233}
\]

Projecting this result into the manifold establishes that

\[ D_i e_j - D_j e_i = 0. \tag{6.234} \]


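Projecting out of the manifold we similarly establish the result

\[ e_i\cdot\mathsf{S}_j = e_j\cdot\mathsf{S}_i. \tag{6.235} \]

In terms of arbitrary tangent vectors $a$ and $b$ this can be written as

\[ a\cdot\mathsf{S}(b) = b\cdot\mathsf{S}(a). \tag{6.236} \]

The shape tensor can be written in terms of the coordinate vectors as

\[ \mathsf{S}(a) = e^k\wedge\mathsf{P}_\perp(a\cdot\partial e_k). \tag{6.237} \]

It follows that

\[
\mathsf{S}_i = e^k\wedge\mathsf{P}_\perp(\partial_i e_k) = e^k\wedge\mathsf{P}_\perp(\partial_k e_i).
\tag{6.238}
\]

The tangent vectors therefore satisfy

\[
\partial\wedge e_i = e^k\wedge\big( \mathsf{P}(\partial_k e_i) + \mathsf{P}_\perp(\partial_k e_i) \big) = D\wedge e_i + \mathsf{S}_i.
\tag{6.239}
\]

If we decompose a vector in the tangent space as $a = a^i e_i$ we establish the general
result that

\[ \partial\wedge a = D\wedge a + \mathsf{S}(a). \tag{6.240} \]

This gives a further interpretation to the shape tensor. It is the object which
picks up the component of the curl of a tangent vector which lies outside the
tangent space. As we can write

\[
\partial\wedge a = \partial\wedge(\mathsf{P}(a)) = \dot{\partial}\wedge\dot{\mathsf{P}}(a) + \mathsf{P}(\partial\wedge a)
 = D\wedge a + \dot{\partial}\wedge\dot{\mathsf{P}}(a),
\tag{6.241}
\]

we establish the further result

\[ \dot{\partial}\wedge\dot{\mathsf{P}}(a) = \mathsf{S}(a). \tag{6.242} \]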


This is easily seen to be consistent with the definition of the shape tensor in
terms of the derivative of the pseudoscalar.

If we now apply the preceding to the case of the curl of a gradient of a scalar,
we find that

\[
\partial\wedge\partial\phi = \mathsf{P}(\nabla)\wedge\mathsf{P}(\nabla\phi)
 = \mathsf{P}(\nabla\wedge\nabla\phi) + \dot{\partial}\wedge\dot{\mathsf{P}}(\nabla\phi).
\tag{6.243}
\]

But the ambient derivative satisfies the integrability condition $\nabla\wedge\nabla = 0$. It
follows that we have

\[ \partial\wedge\partial\phi = \mathsf{S}(\nabla\phi), \tag{6.244} \]

which lies outside the manifold. The covariant derivative therefore satisfies

\[ D\wedge(D\phi) = 0. \tag{6.245} \]

An important application of this result is to the coordinate scalars themselves.
We find that

\[ D\wedge(D x^i) = D\wedge e^i = 0, \tag{6.246} \]

which can also be proved directly from equation (6.234). Applying this result to
an arbitrary vector $a = a_i e^i$ we find that

\[
D\wedge a = D\wedge(a_j e^j) = e^i\wedge e^j\, (\partial_i a_j)
 = \tfrac{1}{2}\, e^i\wedge e^j\, (\partial_i a_j - \partial_j a_i).
\tag{6.247}
\]

This demonstrates that the $D\wedge$ operator is precisely the exterior derivative of
differential geometry.


6.5.5 Riemannian geometry 


To understand further how the shape tensor can specify the intrinsic geometry 
of a surface, we now make contact with Riemannian geometry. In Riemannian 
geometry one focuses entirely on the intrinsic properties of a manifold. It is 
customary to formulate the subject using the metric tensor as the starting point. 
In terms of the $\{e_i\}$ coordinate frame the metric tensor is defined in the expected
manner:

\[ g_{ij} = e_i\cdot e_j. \tag{6.248} \]


In what follows we will not place any restriction on the signature of the tangent 
space. Some texts prefer to use the adjective ‘Riemannian’ to refer to extensions 
of Euclidean geometry to curved spaces (as Riemann originally intended). But 
in the physics literature it is quite standard now to refer to general relativity as 
a theory of Riemannian geometry, despite the Lorentzian signature. 

After the metric, the next main object in Riemannian geometry is the Christoffel
connection. The directional covariant derivative, $D_i$, restricts the result of its
action to the tangent space. The result of its action on one of the $\{e_i\}$ vectors
can therefore be decomposed uniquely in the $\{e_i\}$ frame. The coefficients of this
define the Christoffel connection by

\[ \Gamma^i_{jk} = (D_j e_k)\cdot e^i. \tag{6.249} \]

The components of the connection are clearly dependent on the choice of coordinate
system, as well as the underlying geometry. It follows that a connection is
necessary even when working in a curvilinear coordinate system in a flat space.
A connection on its own does not imply that a space is curved. A typical use
of the Christoffel connection is in finding the components in the $\{e_i\}$ frame of a
covariant derivative $a\cdot D\, b$, for example. We form

\[
(a\cdot D\, b)\cdot e^i = a^j \big( D_j(b^k e_k) \big)\cdot e^i = a^j (\partial_j b^i + \Gamma^i_{jk} b^k),
\tag{6.250}
\]

which shows how the connection accounts for the position dependence in the
coordinate frame.

The components of the Christoffel connection can be found directly from the
metric without referring to the frame vectors themselves. To achieve this we first
establish a pair of results. The first is that the connection $\Gamma^i_{jk}$ is symmetric on
the $jk$ indices. This follows from

\[
\Gamma^i_{jk} - \Gamma^i_{kj} = (D_j e_k - D_k e_j)\cdot e^i = 0,
\tag{6.251}
\]

where we have used equation (6.234). The second result is for the curl of a frame
vector,

\[ D\wedge e_i = D\wedge(g_{ij} e^j) = (D g_{ij})\wedge e^j. \tag{6.252} \]

We can now write

\[
\begin{aligned}
\Gamma^i_{jk} &= \tfrac{1}{2}\, e^i\cdot(D_j e_k + D_k e_j) \\
 &= \tfrac{1}{2}\, e^i\cdot\big( e_j\cdot(D g_{kl}\wedge e^l) + e_k\cdot(D g_{jl}\wedge e^l) + D g_{jk} \big) \\
 &= \tfrac{1}{2}\, e^i\cdot\big( \partial_j g_{kl}\, e^l + \partial_k g_{jl}\, e^l - D g_{jk} \big) \\
 &= \tfrac{1}{2}\, g^{il} (\partial_j g_{kl} + \partial_k g_{jl} - \partial_l g_{jk}),
\end{aligned}
\tag{6.253}
\]

which recovers the familiar definition of the Christoffel connection.
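
The final line of equation (6.253) translates directly into a short numerical routine. The sketch below assumes the unit 2-sphere with coordinates (theta, phi) and metric diag(1, sin^2 theta); this example metric, the finite-difference step and all function names are illustrative assumptions. The index layout `gamma[i][j][k]` mirrors the upper and lower indices of the connection.

```python
import math

def metric(q):
    """Metric g_ij of the unit 2-sphere in coordinates q = (theta, phi)."""
    theta = q[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(q, k, h=1e-5):
    """Central-difference estimate of the partial derivative d_k g_ij."""
    qp, qm = list(q), list(q)
    qp[k] += h
    qm[k] -= h
    gp, gm = metric(qp), metric(qm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(q):
    """gamma[i][j][k] = (1/2) g^{il} (d_j g_kl + d_k g_jl - d_l g_jk)."""
    ginv = inverse_2x2(metric(q))
    dg = [d_metric(q, k) for k in range(2)]     # dg[k][i][j] = d_k g_ij
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                gamma[i][j][k] = 0.5 * sum(
                    ginv[i][l] * (dg[j][k][l] + dg[k][j][l] - dg[l][j][k])
                    for l in range(2)
                )
    return gamma
```

Only the metric enters the calculation, in line with the remark that the connection can be found without reference to the frame vectors. For the sphere the non-zero components are the familiar -sin(theta)cos(theta) and cot(theta) terms.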

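We now seek a method of encoding the intrinsic curvature of a Riemannian
manifold. Suppose we form the commutator of two covariant derivatives

\[
\begin{aligned}
[D_i, D_j] A &= \partial_i(\partial_j A + \mathsf{S}_j\times A) + \mathsf{S}_i\times(\partial_j A + \mathsf{S}_j\times A) \\
 &\quad - \partial_j(\partial_i A + \mathsf{S}_i\times A) - \mathsf{S}_j\times(\partial_i A + \mathsf{S}_i\times A) \\
 &= (\partial_i \mathsf{S}_j - \partial_j \mathsf{S}_i)\times A + (\mathsf{S}_i\times\mathsf{S}_j)\times A,
\end{aligned}
\tag{6.254}
\]

where we have used the Jacobi identity of section 4.1.3. Remarkably, all derivatives
of the multivector $A$ have cancelled out and what remains is a commutator
with a bivector. To simplify this we form

\[
\begin{aligned}
\partial_i \mathsf{S}_j - \partial_j \mathsf{S}_i &= -\partial_i(\partial_j I\, I^{-1}) + \partial_j(\partial_i I\, I^{-1}) \\
 &= -\mathsf{S}_j I\, \mathsf{S}_i I^{-1} + \mathsf{S}_i I\, \mathsf{S}_j I^{-1} \\
 &= -2\, \mathsf{S}_i\times\mathsf{S}_j,
\end{aligned}
\tag{6.255}
\]

where we have used the fact that $\mathsf{S}(a)$ anticommutes with $I$. On substituting
this result in equation (6.254) we obtain the simple result

\[ [D_i, D_j] A = -(\mathsf{S}_i\times\mathsf{S}_j)\times A. \tag{6.256} \]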
The commutator of covariant derivatives defines the Riemann tensor. We denote
this by $\mathsf{R}(a\wedge b)$, where

\[ \mathsf{R}(e_i\wedge e_j)\times A = [D_i, D_j] A. \tag{6.257} \]

$\mathsf{R}(a\wedge b)$ is a bivector-valued linear function of its bivector argument. In terms of
the shape tensor we have

\[ \mathsf{R}(a\wedge b) = \mathsf{P}\big( \mathsf{S}(b)\times\mathsf{S}(a) \big). \tag{6.258} \]

The projection is required here because the Riemann tensor is defined to be
entirely intrinsic to the manifold. The Riemann tensor (and its derivatives) fully 
encodes all of the local intrinsic geometry of a manifold. Since it can be derived 
easily from the shape tensor, it follows that the shape tensor also captures all 
of the intrinsic geometry. In addition to this, the shape tensor tells us about 
the extrinsic geometry — how the manifold is embedded in the larger ambient 
space. 

The Riemann tensor can also be expressed entirely in terms of intrinsic quantities.
To achieve this we first write

\[
\mathsf{R}(e_i\wedge e_j)\cdot e_k = [D_i, D_j] e_k = D_i(\Gamma^a_{jk} e_a) - D_j(\Gamma^a_{ik} e_a).
\tag{6.259}
\]

It follows that

\[
\begin{aligned}
R_{ijk}{}^{l} &= \mathsf{R}(e_i\wedge e_j)\cdot(e_k\wedge e^l) \\
 &= \partial_i \Gamma^l_{jk} - \partial_j \Gamma^l_{ik} + \Gamma^a_{jk}\Gamma^l_{ia} - \Gamma^a_{ik}\Gamma^l_{ja},
\end{aligned}
\tag{6.260}
\]

recovering the standard definition of Riemannian geometry. An immediate advantage
of the geometric algebra route is that many of the symmetry properties
of $R_{ijk}{}^{l}$ follow immediately from the fact that $\mathsf{R}(a\wedge b)$ is a bivector-valued linear
function of a bivector. This immediately reduces the number of degrees of
freedom to $n^2(n-1)^2/4$.


A further symmetry of the Riemann tensor can be found as follows:

\[
\begin{aligned}
\mathsf{R}(e_i\wedge e_j)\cdot e_k &= D_i D_j e_k - D_j D_i e_k \\
 &= D_i D_k e_j - D_j D_k e_i \\
 &= [D_i, D_k] e_j - [D_j, D_k] e_i + D_k (D_i e_j - D_j e_i) \\
 &= \mathsf{R}(e_i\wedge e_k)\cdot e_j - \mathsf{R}(e_j\wedge e_k)\cdot e_i.
\end{aligned}
\tag{6.261}
\]

It follows that

\[
a\cdot\mathsf{R}(b\wedge c) + c\cdot\mathsf{R}(a\wedge b) + b\cdot\mathsf{R}(c\wedge a) = 0,
\tag{6.262}
\]

for any three vectors $a$, $b$, $c$ in the tangent space. This equation tells us that
a vector quantity vanishes for all trivectors $a\wedge b\wedge c$, which provides a set of
$n^2(n-1)(n-2)/6$ scalar equations. The number of independent degrees of
freedom in the Riemann tensor is therefore reduced to

\[
\tfrac{1}{4}\, n^2 (n-1)^2 - \tfrac{1}{6}\, n^2 (n-1)(n-2) = \tfrac{1}{12}\, n^2 (n^2 - 1).
\tag{6.263}
\]

This gives the values 1, 6 and 20 for two, three and four dimensions respectively.
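
The counting argument behind equation (6.263) is easy to check with exact arithmetic. The sketch below is a minimal illustration using Python's `fractions` module; the function names are chosen here for convenience and are not taken from the text.

```python
from fractions import Fraction

def riemann_components(n):
    """Independent components of the Riemann tensor in n dimensions.

    Starts from the n^2 (n-1)^2 / 4 components of a bivector-valued linear
    function of a bivector and subtracts the n^2 (n-1)(n-2) / 6 scalar
    constraints supplied by the cyclic identity (6.262).
    """
    bivector_map = Fraction(n * n * (n - 1) ** 2, 4)
    cyclic_constraints = Fraction(n * n * (n - 1) * (n - 2), 6)
    return bivector_map - cyclic_constraints

def closed_form(n):
    """The closed form n^2 (n^2 - 1) / 12 on the right of equation (6.263)."""
    return Fraction(n * n * (n * n - 1), 12)
```

Evaluating `riemann_components(n)` for n = 2, 3 and 4 reproduces the values quoted above, and the two functions agree for every dimension tried.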


Further properties of the Riemann tensor are covered in more detail in later 
chapters, where in particular we are interested in its relevance to gravitation. 
The fact that Riemannian geometry is founded on the covariant derivative
$D$, as opposed to the projected vector derivative $\partial$, limits the application of the
integral theorem of equation (6.201). If one attempts to add multivectors from
different points in the surface, there is no guarantee that the result remains
intrinsic. The only quantities that can be combined from different points on the
surface are scalars, or functions taking their values in a different space (such as a
Lie group). The most significant integral theorem that remains is a generalization
of Stokes’ theorem, applicable to a grade-$r$ multivector $A_r$ and an open surface $\sigma$
of dimension $r + 1$. For this case we have

\[
\oint_{\partial\sigma} A_r\cdot dS = \int_\sigma (\dot{A}_r\wedge\dot{\partial})\cdot dX
 = (-1)^r \int_\sigma (D\wedge A_r)\cdot dX,
\tag{6.264}
\]

which only features intrinsic quantities. A particular case of this is when $r = n - 1$,
which recovers the divergence theorem. This is important for constructing
conservation theorems in curved spaces.


6.5.6 Transformations and maps 


The study of maps between vector manifolds helps to clarify some of the re- 
lationships between the structures defined in this chapter and more standard 
formulations of differential geometry. Suppose that f(x) defines a map from one 
vector manifold to another. We denote these M and M’, so that 


\[ x' = f(x) \tag{6.265} \]


associates a point in the manifold M’ with one in M. We will only consider 
smooth, differentiable, invertible maps between manifolds. In the mathematics 
literature these are known as diffeomorphisms. These are a subset of the more 
general concept of a homeomorphism, which maps continuously between spaces 
without the restriction of smoothness. Somewhat surprisingly, these two con- 
cepts are not equivalent. It is possible for two manifolds to be homeomorphic, 
but not admit a diffeomorphism between them. This implies that it is possible 
for a single topological space to admit more than one differentiable structure. 
The first example of this to be discovered was the sphere $S^7$, which admits 28 distinct
differentiable structures! In 1983 Donaldson proved the even more striking
result that four-dimensional space $\mathbb{R}^4$ admits an infinite number of differentiable
structures.

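A path in M, $x(\lambda)$, maps directly to a path in M′. The map accordingly
induces a map between tangent vectors, as seen by forming

\[
\frac{\partial x'(\lambda)}{\partial\lambda} = \frac{\partial f(x(\lambda))}{\partial\lambda} = \mathsf{f}(v),
\tag{6.266}
\]

where $v$ is the tangent vector in M, $v = \partial_\lambda x(\lambda)$, and the linear function $\mathsf{f}$ is
defined by

\[ \mathsf{f}(a) = a\cdot\partial f(x) = \mathsf{f}(a; x). \tag{6.267} \]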


The function $\mathsf{f}(a)$ takes a tangent vector in M as its linear argument, and returns
the image tangent vector in M′. If we denote the latter by $a'$, and write out the
position dependence explicitly, we have

\[ a'(x') = \mathsf{f}(a(x); x). \tag{6.268} \]

This map is appropriate for tangent vectors, so applies to the coordinate frame
vectors $\{e_i\}$. These map to an equivalent frame for the tangent space to M′,

\[ e'_i = \mathsf{f}(e_i). \tag{6.269} \]

The reciprocal frame in the transformed space is therefore given by

\[ e'^i = \bar{\mathsf{f}}^{-1}(e^i). \tag{6.270} \]

The fact that the map $x \mapsto f(x)$ is assumed to be invertible ensures that the
adjoint function $\bar{\mathsf{f}}(a)$ is also invertible.

Under transformations, therefore, vectors in one space can transform in two
different ways. If they are tangent vectors they transform under the action
of $\mathsf{f}(a)$. If they are cotangent vectors they transform under the action of $\bar{\mathsf{f}}^{-1}(a)$.
In differential geometry it is standard practice to maintain a clear distinction
between these types of vectors, so one usually thinks of tangent and cotangent
vectors as lying in separate linear spaces. The contraction relation $e^i\cdot e_j = \delta^i_j$
identifies the spaces as dual to each other. This relation is metric-independent
and is preserved by arbitrary diffeomorphisms. These maps relate differentiable
manifolds, and two diffeomorphic spaces are usually viewed as the same manifold.

A metric is regarded as an additional construct on a differentiable manifold, 
which maps between the tangent and cotangent spaces. In the vector manifold 
picture this map is achieved by constructing the reciprocal frame using equa- 
tion (4.94). In using this relation we are implicitly employing a metric in the 
contraction with the pseudoscalar. For the theory of vector manifolds it is there- 
fore useful to distinguish objects and operations that transform simply under 
diffeomorphisms. These will define the metric-independent features of a vector 
manifold. Metric-dependent quantities, like the Riemann tensor, invariably have 
more complicated transformation laws. 

The exterior product of a pair of tangent vectors transforms as

\[
e_i\wedge e_j \mapsto \mathsf{f}(e_i)\wedge\mathsf{f}(e_j) = \mathsf{f}(e_i\wedge e_j).
\tag{6.271}
\]

For example, if $I'$ is the unit pseudoscalar for M′ we have

\[ \mathsf{f}(I) = \det(\mathsf{f})\, I' \tag{6.272} \]

and for invertible maps we must have $\det(\mathsf{f}) \neq 0$. Similarly, for cotangent vectors
we see that

\[
e^i\wedge e^j \mapsto \bar{\mathsf{f}}^{-1}(e^i)\wedge\bar{\mathsf{f}}^{-1}(e^j) = \bar{\mathsf{f}}^{-1}(e^i\wedge e^j).
\tag{6.273}
\]

So exterior products of like vectors give rise to higher grade objects in a manner
that is unchanged by diffeomorphisms. Metric invariants are constructed from
inner products between tangent and cotangent vectors. Since the derivative of a
scalar field is

\[ \partial\phi = e^i\, \partial_i\phi, \tag{6.274} \]

we see that $\partial\phi$ is a cotangent vector, and we can write

\[ \partial' = \bar{\mathsf{f}}^{-1}(\partial). \tag{6.275} \]

A similar result holds for the covariant derivative $D$. If $a$ is a tangent vector the
directional derivative of a scalar field $a\cdot\partial\phi$ is therefore an invariant,

\[
a'\cdot\partial'\phi' = \mathsf{f}(a)\cdot\bar{\mathsf{f}}^{-1}(\partial)\phi = a\cdot\partial\phi,
\tag{6.276}
\]

where $\phi'(x') = \phi(x)$.
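
The invariance expressed by equation (6.276) can be tested numerically for a specific map. The sketch below assumes a flat two-dimensional manifold (the plane) and a simple shear as the diffeomorphism, chosen only because its inverse is available in closed form; the scalar field `phi`, the sample point and the finite-difference step are likewise illustrative assumptions.

```python
import math

def f(p):
    """Illustrative diffeomorphism of the plane (a shear with a simple inverse)."""
    x, y = p
    return (x, y + x * x)

def f_inv(q):
    x, y = q
    return (x, y - x * x)

def phi(p):
    """Illustrative scalar field on the original space."""
    x, y = p
    return math.sin(x) * y + x * x

def phi_prime(q):
    """Transformed field, defined so that phi'(x') = phi(x)."""
    return phi(f_inv(q))

def directional(func, p, a, h=1e-5):
    """Central-difference directional derivative of a scalar function along a."""
    pp = (p[0] + h * a[0], p[1] + h * a[1])
    pm = (p[0] - h * a[0], p[1] - h * a[1])
    return (func(pp) - func(pm)) / (2 * h)

def push_forward(p, a):
    """Image tangent vector: the differential of f applied to a, from the exact Jacobian."""
    x, _ = p
    return (a[0], 2.0 * x * a[0] + a[1])

p = (0.4, -1.1)
a = (0.8, 0.5)
lhs = directional(phi_prime, f(p), push_forward(p, a))   # derivative in the image space
rhs = directional(phi, p, a)                             # derivative in the original space
```

The two numbers `lhs` and `rhs` agree to within the finite-difference error, even though neither the tangent vector nor the gradient is individually unchanged by the map.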


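In constructing the covariant derivative in section 6.5.3, we made use of the
projection operation $\mathsf{P}(a)$. This is a metric operation, as it relies on a contraction
with $I$. Hence the covariant derivatives $D_i e_j$ do depend on the metric (via the
connection). To establish a metric-independent operation we let $a$ and $b$ represent
tangent vectors and form

\[
\begin{aligned}
a\cdot\partial b - b\cdot\partial a &= a\cdot D b - b\cdot D a + a\cdot\mathsf{S}(b) - b\cdot\mathsf{S}(a) \\
 &= a\cdot D b - b\cdot D a.
\end{aligned}
\tag{6.277}
\]

The shape terms cancel, so the result is intrinsic to the manifold. Under a
diffeomorphism the result transforms to

\[
a\cdot\partial\, \mathsf{f}(b) - b\cdot\partial\, \mathsf{f}(a)
 = \mathsf{f}(a\cdot\partial b - b\cdot\partial a) + a\cdot\dot{\partial}\, \dot{\mathsf{f}}(b) - b\cdot\dot{\partial}\, \dot{\mathsf{f}}(a).
\tag{6.278}
\]

But $\mathsf{f}(a)$ is the differential of the map $f(x)$, so we have

\[
(\partial_i\partial_j - \partial_j\partial_i) f(x) = \partial_i \mathsf{f}(e_j) - \partial_j \mathsf{f}(e_i)
 = \dot{\partial}_i \dot{\mathsf{f}}(e_j) - \dot{\partial}_j \dot{\mathsf{f}}(e_i) = 0.
\tag{6.279}
\]

It follows that, for tangent vectors $a$ and $b$,

\[
a\cdot\dot{\partial}\, \dot{\mathsf{f}}(b) - b\cdot\dot{\partial}\, \dot{\mathsf{f}}(a) = 0.
\tag{6.280}
\]

We therefore define the Lie derivative $\mathcal{L}_a b$ by

\[ \mathcal{L}_a b = a\cdot\partial b - b\cdot\partial a. \tag{6.281} \]

This results in a new tangent vector, and transforms under diffeomorphisms as

\[ \mathcal{L}_a b \mapsto \mathcal{L}'_{a'} b' = \mathsf{f}(\mathcal{L}_a b). \tag{6.282} \]

Relations between tangent vectors constructed from the Lie derivative will therefore
be unchanged by diffeomorphisms.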

A similar construction is possible for cotangent vectors. If we contract equation
(6.279) with $\bar{\mathsf{f}}^{-1}(e^k)$ we obtain

\[
\mathsf{f}(e_j)\cdot\big( \partial_i \bar{\mathsf{f}}^{-1}(e^k) \big) - \mathsf{f}(e_i)\cdot\big( \partial_j \bar{\mathsf{f}}^{-1}(e^k) \big) = 0.
\tag{6.283}
\]

Now multiplying by $\bar{\mathsf{f}}^{-1}(e^i\wedge e^j)$ and summing we find that

\[
\mathsf{P}'\big( \bar{\mathsf{f}}^{-1}(\partial)\wedge\bar{\mathsf{f}}^{-1}(e^k) \big) = 0.
\tag{6.284}
\]

This result can be summarised simply as

\[ D'\wedge e'^k = D'\wedge\bar{\mathsf{f}}^{-1}(e^k) = 0. \tag{6.285} \]

This is sufficient to establish that the exterior derivative of a cotangent vector
results in a cotangent bivector (equivalent to a 2-form). The result transforms
in the required manner:

\[ D\wedge A \mapsto D'\wedge A' = \bar{\mathsf{f}}^{-1}(D\wedge A). \tag{6.286} \]


This is the result that makes the exterior algebra of cotangent vectors so powerful 
for studying the topological features of manifolds. This algebra is essentially that 
of differential forms, as is explained in section 6.5.7. For example, a form is said 
to be closed if its exterior derivative is zero, and to be exact if it can be written 
as the exterior derivative of a form of one degree lower. Both of these properties 
are unchanged by diffeomorphisms, so the size of the space of functions that are 
closed but not exact is a topological feature of a space. This is the basis of de 
Rham cohomology. 

It is somewhat less common to see diffeomorphisms discussed when studying 
Riemannian geometry. More usually one focuses attention on the restricted class 
of isometries, which are diffeomorphisms that preserve the metric. These define 
symmetries of a Riemannian space. In the vector manifold setting, however, it is 
natural to study the effect of maps on metric-dependent quantities. The reason
is that vector manifolds inherit their metric structure from the embedding,
and if the embedding is changed by a diffeomorphism, the natural metric is 
changed as well. One does not have to inherit the metric from an embedding. 
One can easily impose a metric on a vector manifold by defining a linear transfor- 
mation over the manifold. This takes us into the subject of induced geometries, 
which is closer to the spirit of the approach to gravity adopted in chapter 14. 
Similarly, when transforming a vector manifold, one need not insist that the 
transformed metric is that inherited by the new embedding. One can instead 
simply define a new metric on the transformed space directly from the original 
one. 

The simplest example of a diffeomorphism inducing a new geometry is to 
consider a flat plane in three dimensions. If the plane is distorted in the third 
direction, and the new metric taken as that implied by the embedding, the surface 
clearly becomes curved. Formulae for the effects of such transformations are 
generally quite complex. Most can be derived from the transformation properties
of the projection operation,

\[ \mathsf{P}' = \mathsf{f}\, \mathsf{P}\, \mathsf{f}^{-1}. \tag{6.287} \]

This identity ensures that the projection and transformation formulae can be
applied in either order. If we now form

\[
e'_i\cdot\mathsf{S}'_j = \mathsf{P}'_\perp\big( \partial_j \mathsf{f}(e_i) \big)
 = \mathsf{f}(e_i\cdot\mathsf{S}_j) + \mathsf{P}'_\perp\big( \dot{\partial}_j \dot{\mathsf{f}}(e_i) \big),
\tag{6.288}
\]

we see that the shape tensor transforms according to

\[
a'\cdot\mathsf{S}'(b') = \mathsf{f}(a\cdot\mathsf{S}(b)) + \mathsf{P}'_\perp\big( b\cdot\dot{\partial}\, \dot{\mathsf{f}}(a) \big).
\tag{6.289}
\]

Further results can be built up from this. For example, the new Riemann tensor
is constructed from the commutator of the transformed shape tensor.


6.5.7 Differential geometry and forms 


So far we have been deliberately loose in relating objects in vector manifold the- 
ory to those of modern differential geometry texts. In this section we clarify the 
relations and distinctions between the viewpoints. In the subject of differential 
geometry it is now common practice to identify directional derivatives as tangent
vectors, so that the tangent vector $a$ is the scalar operator

\[ a = a^i \frac{\partial}{\partial x^i}. \tag{6.290} \]

Tangent vectors form a linear space, denoted $T_xM$, where $x$ labels a point in
the manifold M. This notion of a tangent vector is slightly different from that
adopted in the vector manifold theory, where we explicitly let the directional
derivative act on the vector $x$. As explained earlier, the limit implied in writing
$\partial x/\partial x^i$ is only well defined if an embedding picture is assumed. The reason
for the more abstract definition of a tangent vector in the differential geometry
literature is to remove the need for an embedding, so that a topological space
can be viewed as a single distinct entity. There are arguments in favour of, and
against, both viewpoints. For all practical purposes, however, the philosophies
behind the two viewpoints are largely irrelevant, and calculations performed in
either scheme will return the same results.

The dual space to $T_xM$ is called the cotangent space and is denoted $T^*_xM$.
Elements of $T^*_xM$ are called cotangent vectors, or 1-forms. The inner product
between a tangent and cotangent vector can be written as $\langle\omega, a\rangle$. A basis for the
dual space is defined by the coordinate differentials $dx^i$, so that

\[ \langle dx^i, \partial/\partial x^j \rangle = \delta^i_j. \tag{6.291} \]

A 1-form therefore implicitly contains a directed measure on a manifold. So, if
$\alpha_1$ is a 1-form we have

\[ \alpha_1 = \alpha_i\, dx^i = A_1\cdot(dx), \tag{6.292} \]

where $A_1$ is a grade-1 multivector in the vector manifold sense. Similarly, if $dX$
is a directed measure over a two-dimensional surface, we have

\[ dX = e_i\wedge e_j\, dx^i dx^j, \tag{6.293} \]

so that

\[ (e^i\wedge e^j)^\dagger\cdot dX = dx^i dx^j - dx^j dx^i. \tag{6.294} \]

An arbitrary 2-form can be written as

\[
\alpha_2 = \frac{1}{2!}\, \alpha_{ij} (dx^i dx^j - dx^j dx^i) = A_2^\dagger\cdot dX.
\tag{6.295}
\]

Here $A_2$ is the multivector

\[ A_2 = \frac{1}{2!}\, \alpha_{ij}\, e^i\wedge e^j, \tag{6.296} \]

which has the same components as the differential form. More generally, an
$r$-form $\alpha_r$ can be written as

\[ \alpha_r = A_r^\dagger\cdot dX_r = A_r\cdot dX_r^\dagger. \tag{6.297} \]

Clearly there is little difference in working with the $r$-form $\alpha_r$ or the equivalent
multivector $A_r$. So, for example, the outer product of two 1-forms results in the
2-form

\[
\alpha_1\wedge\beta_1 = \alpha_i \beta_j\, (e^i\wedge e^j)\cdot dX_2^\dagger = (A_1\wedge B_1)\cdot dX_2^\dagger,
\tag{6.298}
\]

where $dX_2$ is a two-dimensional surface measure and $A_1$, $B_1$ are the grade-1
multivectors with components $\alpha_i$ and $\beta_i$ respectively. Similarly, the exterior
derivative of an $r$-form is given by

\[ d\alpha_r = (D\wedge A_r)\cdot dX_{r+1}^\dagger. \tag{6.299} \]


The fact that forms come packaged with an implicit measure allows for a
highly compact statement of Stokes’ theorem, as given in equation (6.264). In
ultra-compact notation this says that

\[ \int_{\sigma_r} d\alpha = \oint_{\partial\sigma_r} \alpha, \tag{6.300} \]

where $\alpha$ is an $(r - 1)$-form integrated over an open $r$-surface $\sigma_r$. This is entirely
equivalent to equation (6.264), as can be seen by writing

\[
\int_{\sigma_r} d\alpha = \int_{\sigma_r} (\dot{A}^\dagger_{r-1}\wedge\dot{D})\cdot dX_r
 = \oint_{\partial\sigma_r} A^\dagger_{r-1}\cdot dS = \oint_{\partial\sigma_r} \alpha.
\tag{6.301}
\]

One can proceed in this manner to establish a direct translation scheme between
the languages of differential forms and vector manifolds. Many of the expressions
are so similar that there is frequently little point in maintaining a distinction.
If the language of differential forms is applied in a metric setting, an important
additional concept is that of a duality transformation, also known as the Hodge
$\ast$ (star) operation. To define this we first introduce the volume form

\[
\Omega = \sqrt{|g|}\, dx^1\wedge dx^2\wedge\cdots\wedge dx^n
 = \sqrt{|g|}\, (e^n\wedge e^{n-1}\wedge\cdots\wedge e^1)\cdot dX.
\tag{6.302}
\]

The pseudoscalar for a vector manifold, given a coordinate frame with the specified
orientation, is given by

\[
I = \frac{1}{\sqrt{|g|}}\, (e_1\wedge e_2\wedge\cdots\wedge e_n).
\tag{6.303}
\]

This definition was chosen earlier to ensure that $I^2 = \pm 1$ and that $I$ keeps the
orientation specified by the frame. It follows that

\[ \Omega = I^{-1}\cdot dX, \tag{6.304} \]

so that the equivalent multivector is $(I^{-1})^\dagger$. This will equal $\pm I$, depending on
signature. The Hodge $\ast$ of an $r$-form $\alpha_r$ is the $(n - r)$-form

\[
\ast\alpha_r = \frac{\sqrt{|g|}}{r!\,(n-r)!}\, \alpha_{i_1\cdots i_r}\,
 \epsilon^{i_1\cdots i_r}{}_{j_{r+1}\cdots j_n}\, dx^{j_{r+1}}\wedge\cdots\wedge dx^{j_n},
\tag{6.305}
\]

where $\epsilon_{i_1,\ldots,i_n}$ denotes the alternating tensor. If $A_r$ is the multivector equivalent
of $\alpha_r$, the Hodge $\ast$ takes the form

\[ \ast A_r = (I^{-1} A_r)^\dagger = (I^{-1}\cdot A_r)^\dagger. \tag{6.306} \]

In effect, we are multiplying by the pseudoscalar, as one would expect for a
duality relation. Applied twice we find that

\[
\ast\ast A_r = \big( I^{-1} (I^{-1} A_r)^\dagger \big)^\dagger = (-1)^{r(n-r)}\, A_r\, I^\dagger I.
\tag{6.307}
\]

In spaces with Euclidean signature, $I^\dagger I = +1$. In spaces of mixed signature
the sign depends on whether there are an even or odd number of basis vectors
with negative norm. It is a straightforward exercise to prove the main results for
the Hodge $\ast$ operation, given equation (6.307) and the fact that $I$ is covariantly
conserved.


6.6 Elasticity 


As a more extended application of some of the ideas developed in this chapter, 
we discuss the foundations of the subject of elasticity. The behaviour of a solid 
object is modelled by treating the object as a continuum. Locally, the strains 
in the object will tend to be small, but these can build up to give large global 
displacements. As such, it is important to treat the full, non-linear theory of 
elasticity. Only then can one be sure about the validity of various approximation 
schemes, such as assuming small deflections.

Our discussion is based on a generalisation of the ideas employed in the
treatment of a rigid body. We first introduce an undeformed, reference
configuration, with points in this labelled with the vector x. This is sometimes
referred to as the material configuration. Points in the spatial configuration, y,
are obtained by a non-linear displacement f of the reference configuration, so that


$$y = y(x,t) = f(x,t). \tag{6.308}$$


We use non-bold vectors to label points in the body, and bold to label tangent 
vectors in either the reference or spatial body. We assume that the background 
space is flat, three-dimensional Euclidean space. 


6.6.1 Body strains 


To calculate the strains in the body, consider the image of the vector between 
two nearby points in the reference configuration, 


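$$(x+\epsilon a) - x \;\mapsto\; y(x+\epsilon a) - y(x) = \epsilon\,\mathsf{f}(a) + O(\epsilon^2), \tag{6.309}$$

where $\mathsf{f}$ is the deformation gradient,

$$\mathsf{f}(a) = a\cdot\nabla y = a\cdot\nabla f(x,t). \tag{6.310}$$

The function $\mathsf{f}$ maps a tangent vector in the reference configuration to the
equivalent vector in the spatial configuration. That is, if $x(\lambda)$ is a curve in
the reference configuration with tangent vector

$$x' = \frac{\partial x(\lambda)}{\partial\lambda}, \tag{6.311}$$

then the spatial curve has tangent vector $\mathsf{f}(x')$. The length of the curve
$x(\lambda)$ in the reference configuration is

$$\int\left|\frac{\partial x}{\partial\lambda}\right|d\lambda = \int (x'^2)^{1/2}\,d\lambda. \tag{6.312}$$

The length of the induced curve in the spatial configuration is therefore

$$\int\big(\mathsf{f}(x')^2\big)^{1/2}\,d\lambda = \int\big(x'\cdot\bar{\mathsf{f}}\mathsf{f}(x')\big)^{1/2}\,d\lambda. \tag{6.313}$$

We define the (right) Cauchy–Green tensor C by

$$\mathsf{C}(a) = \bar{\mathsf{f}}\mathsf{f}(a). \tag{6.314}$$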


This tensor is a symmetric, positive-definite map between vectors in the reference 
configuration. It describes a set of positive dilations along the principal directions 
in the reference configuration. The eigenvalues of C can be written as
$(\lambda_1^2, \lambda_2^2, \lambda_3^2)$, where the $\lambda_i$ define the principal
stretches. The deviations of these from unity measure the strains in the material.


Figure 6.7 An elastic body. The function $f(x,t)$ maps points in the reference
configuration to points in the spatial configuration. Coordinate curves
$e_1$ and $e_2$ map to $\mathsf{f}(e_1)$ and $\mathsf{f}(e_2)$. The normal vector in the
spatial configuration therefore lies in the $\bar{\mathsf{f}}^{-1}(e^3)$ direction.
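
As a small numerical illustration of the Cauchy–Green tensor, the sketch below
represents the deformation gradient by a 3×3 matrix in a Cartesian frame, so that
the adjoint becomes the matrix transpose. The particular matrix `F` and the test
vector `a` are hypothetical values chosen only for illustration; they are not
taken from the text.

```python
import numpy as np

# Hypothetical deformation gradient (not from the text): a stretch along x
# combined with a simple shear. Column i holds the image f(e_i) of the i-th
# Cartesian frame vector of the reference configuration.
F = np.array([[1.2, 0.3, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.9]])

# Right Cauchy-Green tensor C = f-bar f; in matrix form this is F^T F.
C = F.T @ F

# C is symmetric and positive definite, so a symmetric eigensolver applies.
eigenvalues, principal_directions = np.linalg.eigh(C)
principal_stretches = np.sqrt(eigenvalues)

# Squared length of the image of a reference tangent vector a:
# |f(a)|^2 should equal a . C(a).
a = np.array([0.5, -1.0, 2.0])
length_sq_spatial = (F @ a) @ (F @ a)
length_sq_from_C = a @ (C @ a)

print("principal stretches:", principal_stretches)
print("strains (stretch - 1):", principal_stretches - 1.0)
```

The product of the principal stretches equals the determinant of the deformation
gradient, which gives a quick consistency check on any computed set of stretches.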


6.6.2 Body stresses 


If we take a cut through the body then the contact force between the surfaces will 
be a function of the normal to the surface (and position in the body). Cauchy 
showed that, under reasonable continuity conditions, this force must be a linear
function of the normal, which we write $\sigma(n) = \sigma(n; x)$. The tensor
$\sigma(n)$ maps a vector normal to a surface in the spatial configuration onto the
force vector, also in the spatial configuration. We will verify shortly that
$\sigma$ is symmetric.

The total force on a volume segment in the body involves integrating $\sigma(n)$
over the surface of the volume. But, as with the rigid body, it is simpler to
perform all calculations back in the reference copy. To this end we let $x^i$
denote a set of coordinates for position in the reference body. The associated
coordinate frame is $\{e_i\}$, with reciprocal frame $\{e^i\}$. Suppose now that
$x^1$ and $x^2$ are coordinates for a surface in the reference configuration. The
equivalent normal in the spatial configuration is (see figure 6.7)


$$n = \mathsf{f}(e_1)\wedge\mathsf{f}(e_2)\,I^{-1} = \det(\mathsf{f})\,\bar{\mathsf{f}}^{-1}(e^3). \tag{6.315}$$

The force over this surface is found by integrating the quantity

$$\sigma\big(\mathsf{f}(e_1\wedge e_2)I^{-1}\big)\,dx^1\,dx^2 = \det(\mathsf{f})\,\sigma\big(\bar{\mathsf{f}}^{-1}(e^3)\big)\,dx^1\,dx^2. \tag{6.316}$$

We therefore define the first Piola–Kirchhoff stress tensor $\mathsf{T}$ by

$$\mathsf{T}(a) = \det(\mathsf{f})\,\sigma\bar{\mathsf{f}}^{-1}(a). \tag{6.317}$$


The stress tensor $\mathsf{T}$ takes as its argument a vector normal to a surface in
the reference configuration, and returns the contact force in the spatial body.
The force balance equation tells us that, for any sub-body, we have


$$\frac{d}{dt}\int d^3x\,\rho v = \oint\mathsf{T}(ds) + \int d^3x\,\rho b, \tag{6.318}$$


where $\rho$ is the density in the reference configuration, $v = \dot{y}$ is the
spatial velocity, and $b$ is the applied body force. The fundamental theorem
immediately converts this to the local equation


$$\rho\dot{v} = \check{\mathsf{T}}(\check{\nabla}) + \rho b. \tag{6.319}$$

The check symbol is used for the scope of the derivative, to avoid confusion with
time derivatives (denoted with an overdot). This equation is sensible as $\nabla$
is the vector derivative in the reference configuration, and $\mathsf{T}(\nabla)$ is
a vector in the spatial configuration.

The total torque on a volume element, centred on $y_0$, is (ignoring body forces)


$$M = \oint (y - y_0)\wedge\mathsf{T}(ds). \tag{6.320}$$


This integral runs over the reference body, and returns a torque in the spatial
configuration. This must be equated with the rate of change of angular
momentum, which is


$$\begin{aligned}
\frac{d}{dt}\int d^3x\,\rho\,(y - y_0)\wedge\dot{y} &= \int d^3x\,(y - y_0)\wedge\check{\mathsf{T}}(\check{\nabla}) \\
&= \oint (y - y_0)\wedge\mathsf{T}(ds) - \int d^3x\,\check{y}\wedge\mathsf{T}(\check{\nabla}).
\end{aligned} \tag{6.321}$$

Equating this with M we see that

$$\check{y}\wedge\mathsf{T}(\check{\nabla}) = (\partial_i f(x))\wedge\mathsf{T}(e^i) = \mathsf{f}(e_i)\wedge\mathsf{T}(e^i) = 0. \tag{6.322}$$

It follows that

$$\mathsf{f}(e_i)\wedge\mathsf{T}(e^i) = \det(\mathsf{f})\,\mathsf{f}(e_i)\wedge\sigma\bar{\mathsf{f}}^{-1}(e^i) = 0, \tag{6.323}$$

and we see that $\sigma$ must be a symmetric tensor in order for angular momentum
to be conserved.

It is often convenient to work with a version of $\mathsf{T}$ that is symmetric and
defined entirely in the material frame. We therefore define the second
Piola–Kirchhoff stress tensor $\mathcal{T}$ by


$$\mathcal{T}(a) = \mathsf{f}^{-1}\mathsf{T}(a). \tag{6.324}$$


It is meaningless to talk about symmetries of $\mathsf{T}$, since it maps between
different spaces, whereas $\mathcal{T}$ is defined entirely in the reference
configuration and, by construction, is symmetric.
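
The relations between the three stress tensors can be checked numerically. The
following is a minimal sketch, assuming a Cartesian frame in which linear
functions become 3×3 matrices and the adjoint becomes the transpose. The matrices
`F` (deformation gradient) and `sigma` (a symmetric Cauchy stress) are
hypothetical values used only to exercise the formulae.

```python
import numpy as np

# Hypothetical deformation gradient and symmetric Cauchy stress
# (illustrative values only, not taken from the text).
F = np.array([[1.1, 0.2, 0.0],
              [0.1, 0.9, 0.0],
              [0.0, 0.0, 1.05]])
sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

det_F = np.linalg.det(F)
F_inv = np.linalg.inv(F)

# First Piola-Kirchhoff tensor: T(a) = det(f) sigma f-bar^{-1}(a).
# The inverse adjoint corresponds to the transposed inverse matrix.
T_first = det_F * sigma @ F_inv.T

# Second Piola-Kirchhoff tensor: f^{-1} applied to the first one.
T_second = F_inv @ T_first

# Angular momentum condition f(e_i) ^ T(e^i) = 0: the matrix
# sum_i f(e_i) T(e^i)^T = F T_first^T must have no antisymmetric part.
moment = F @ T_first.T
antisymmetric_part = 0.5 * (moment - moment.T)

print("second P-K symmetric:", np.allclose(T_second, T_second.T))
print("first P-K symmetric: ", np.allclose(T_first, T_first.T))
```

For a generic deformation the first tensor is not a symmetric matrix, which
reflects the fact that it maps between two different spaces, while the second
tensor is symmetric whenever the Cauchy stress is.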

The equations of motion for an elastic material are completed by defining a 
constitutive relation. This relates the stresses to the strains in the body. These 
relations are most easily expressed in the reference copy as a relationship between
$\mathcal{T}$ and C. There is no universal definition of the strain tensor
$\mathcal{E}$, though for certain applications a useful definition is


$$\mathcal{E}(a) = \mathsf{C}^{1/2}(a) - a. \tag{6.325}$$


This tensor is zero if the material is undeformed. Linear materials have the
property that $\mathcal{T}$ and $\mathcal{E}$ are linearly related by a rank-4 tensor.
This can, in principle, have 36 independent degrees of freedom, all of which may
need to be determined experimentally. If the material is homogeneous then the
components of the rank-4 tensor are constants. If the material is also isotropic
then the 36 degrees of freedom reduce to two. These are usually given in terms of
the bulk modulus B and shear modulus G, with $\mathcal{T}$ and $\mathcal{E}$ related
by an expression of the form


$$\mathcal{T}(a) = 2G\,\mathcal{E}(a) + \big(B - \tfrac{2}{3}G\big)\,\mathrm{tr}(\mathcal{E})\,a. \tag{6.326}$$


In many respects this is the simplest material one can consider, though even in 
this case the non-linearity of the force law makes the full equations very hard to 
analyse. The analysis can be aided by the fact that these materials are described 
by an action principle, as discussed in section 12.4.1. 
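
The constitutive relation for a linear, isotropic, homogeneous material can be
sketched in a few lines. The code below assumes a Cartesian matrix
representation, takes the strain to be the square root of the Cauchy–Green tensor
minus the identity, and takes the coefficient of the trace term to be B − 2G/3
(the usual isotropic form; treat this as an assumption if your conventions
differ). The function name `second_pk_stress` and the moduli values are
hypothetical.

```python
import numpy as np

def second_pk_stress(F, bulk_modulus, shear_modulus):
    """Second Piola-Kirchhoff stress for an isotropic linear material.

    F is the deformation gradient as a 3x3 matrix. The strain is taken to be
    E = C^(1/2) - 1 with C = F^T F, and the stress is
    2 G E + (B - 2G/3) tr(E) 1.
    """
    C = F.T @ F
    # Symmetric square root of C through its eigendecomposition.
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    sqrt_C = (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.T
    strain = sqrt_C - np.eye(3)
    trace_term = (bulk_modulus - 2.0 * shear_modulus / 3.0) * np.trace(strain)
    return 2.0 * shear_modulus * strain + trace_term * np.eye(3)

# Illustrative moduli (arbitrary units).
B_mod, G_mod = 2.0, 1.0

# A rigid rotation produces no strain and therefore no stress.
angle = 0.3
rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle), np.cos(angle), 0.0],
                     [0.0, 0.0, 1.0]])
stress_rotation = second_pk_stress(rotation, B_mod, G_mod)

# A uniform dilation by a factor (1 + e) gives an isotropic stress 3 B e.
e = 0.01
stress_dilation = second_pk_stress((1.0 + e) * np.eye(3), B_mod, G_mod)
```

The dilation case shows why B is called the bulk modulus: the shear modulus drops
out and the stress is proportional to the volume strain 3e.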


6.7 Notes 


The treatment of vector manifolds presented here is a condensed version of the 
theory developed by Hestenes & Sobczyk in the book Clifford Algebra to Geo- 
metric Calculus (1984) and in a series of papers by Garret Sobczyk. There are 
a number of differences in our presentation, however. Most significant is our 
definition of the orientations in the fundamental theorem of integral calculus. 
Our definition of the boundary operator ensures that a boundary inherits its 
orientation from the directed volume measure. Hestenes & Sobczyk used the 
opposite specification for their boundary operator, which gives rise to a number 
of (fairly trivial) differences. A significant advantage of our conventions is that 
in two dimensions the pseudoscalar has the correct orientation implied by the 
imaginary in the Cauchy integral formula. 

A further difference is that from the outset we have emphasised both the 
implied embedding of a vector manifold, and the fact that this gives rise to a 
metric. A vector manifold thus has greater structure than a differentiable man- 
ifold in the sense of differential geometry. For applications to finite-dimensional 
Riemannian geometry the different approaches are entirely equivalent, as any 
finite-dimensional Riemannian manifold can be embedded in a larger dimen- 
sional flat space in such a way that the metric is generated by the embedding. 
This result was proved by John Nash in 1956. His remarkable story is the subject 
of the book A Beautiful Mind by Sylvia Nasar (1998) and, more recently, a film 
of the same name. In other applications of differential geometry the full range 
of validity of the vector manifold approach has yet to be fully established. The
approach certainly does give streamlined proofs of a number of key results. But
whether this comes with some loss of generality is an open question. 

A final, small difference in our approach here to the original one of Hestenes & 
Sobczyk is our definition of the shape tensor. We have only considered the shape 
tensor S(a) taking intrinsic vectors as its linear argument. This concept can be 
generalised to define a function that can act linearly on general vectors. One of 
the most interesting properties of this generalised version of the shape tensor is
that it provides a natural square root of the Ricci tensor. This theory is developed
in detail in chapter 5 of Clifford Algebra to Geometric Calculus, to which readers
are referred for further information. There is no shortage of good textbooks on
modern differential geometry. The books by Nakahara (1990), Schutz (1980)
and Göckeler & Schücker (1987) are particularly strong on emphasising physical
applications. Elasticity is described in the books by Marsden & Hughes (1994) 
and Antman (1995). 


6.8 Exercises 


6.1 Confirm that the vector derivative is independent of the choice of coor- 
dinate system. 

6.2 If we denote the curl of a vector field J in three dimensions by ∇×J,
show that

$$\nabla\times J = -I\,\nabla\wedge J.$$

Hence prove that

$$\nabla\cdot(\nabla\times J) = 0, \qquad \nabla\times(\nabla\times J) = \nabla(\nabla\cdot J) - \nabla^2 J.$$

6.3 An oblate spheroidal coordinate system can be defined by

$$a\cosh(u)\sin(v) = (x^2 + y^2)^{1/2}, \qquad a\sinh(u)\cos(v) = z, \qquad \tan(\phi) = y/x,$$

where (x, y, z) denote standard Cartesian coordinates and a is a scalar.
Prove that

$$e_u^2 = e_v^2 = a^2\big(\sinh^2(u) + \cos^2(v)\big) = \rho^2,$$

which defines the quantity ρ. Hence prove that the Laplacian becomes

$$\nabla^2\psi = \frac{1}{\rho^2\cosh(u)}\frac{\partial}{\partial u}\left(\cosh(u)\frac{\partial\psi}{\partial u}\right) + \frac{1}{\rho^2\sin(v)}\frac{\partial}{\partial v}\left(\sin(v)\frac{\partial\psi}{\partial v}\right) + \frac{1}{a^2\cosh^2(u)\sin^2(v)}\frac{\partial^2\psi}{\partial\phi^2},$$

and investigate the properties of separable solutions in oblate spheroidal
coordinates.

6.4 Prove that over the surface of a tetrahedron the directed surface integral
satisfies

$$\oint dS = 0.$$

By considering pairs of adjacent tetrahedra, prove that this integral
vanishes for all orientable, connected closed surfaces.

6.5 For a circle in a plane confirm that the line integral around the perimeter
satisfies

$$\oint x\wedge dl = 2A,$$

where A is the oriented area of the circle.

6.6 Prove that

$$\sum_{i=0}^{k}(-1)^i\,b\cdot(x_0 + \cdots + \check{x}_i + \cdots + x_k)\,\Delta(\check{x}_i)_{(k-1)} = \frac{1}{(k-1)!}\,b\cdot(e_1\wedge\cdots\wedge e_k),$$

where the notation follows section 6.4.4.

6.7 Suppose that σ is an n-dimensional surface embedded in a flat space of
dimensions n + 1 with (constant) unit pseudoscalar I. Prove that

$$\oint_{\partial\sigma} dS\,J = -\int_\sigma I\,l\wedge\nabla J\,|dX|,$$

where the normal l is defined by $dX = I\,l\,|dX|$.

6.8 The shape tensor is defined by

$$a\cdot\partial I = I\,S(a) = I\times S(a).$$

Prove that the shape tensor satisfies

$$a\cdot S(b) = b\cdot S(a)$$

and

$$\dot{\partial}\wedge\dot{\mathsf{P}}(a) = S(a),$$

where $\mathsf{P}$ projects into the tangent space, and a and b are tangent vectors.

6.9 An open two-dimensional surface in three-dimensional space is defined
by

$$r(x, y) = x e_1 + y e_2 + \alpha(r) e_3,$$

where $r = (x^2 + y^2)^{1/2}$ and the $\{e_i\}$ are a standard Cartesian frame.
Prove that the Riemann tensor can be written

$$R(a\wedge b) = \frac{\alpha'\alpha''}{r(1 + \alpha'^2)^2}\,a\wedge b,$$

where the primes denote differentiation with respect to r. The scalar
factor κ in $R(a\wedge b) = \kappa\,a\wedge b$ is called the Gaussian curvature.

6.10 A linear, isotropic, homogeneous material is described by a bulk modulus
B and shear modulus G. By linearising the elasticity equations, show
that the longitudinal and transverse sound speeds $v_l$ and $v_t$ are given by

$$v_l^2 = \frac{1}{\rho}\left(B + \frac{4}{3}G\right), \qquad v_t^2 = \frac{G}{\rho}.$$

6.11 Consider an infinite linear, isotropic, homogeneous material containing
a spherical hole into which air is pumped. Show that, in the linearised
theory, the radial stress $\tau_r$ is related to the radius of the hole r by
$\tau_r \propto r^{-3}$. Discuss how the full non-linear theory might modify this
result.


Classical electrodynamics


Geometric algebra offers a number of new techniques for studying problems in 
electromagnetism and electrodynamics. These are described in this chapter. We 
will not attempt a thorough development of electrodynamics, which is a vast 
subject with numerous specialist areas. Instead we concentrate on a number 
of selected applications which highlight the advantages that geometric algebra 
can bring. There are two particularly significant new features that geometric 
algebra adds to traditional formulations of electrodynamics. The first is that, 
through employing the spacetime algebra, all equations can be studied in the 
appropriate spacetime setting. This is much more transparent than the more 
traditional approach based on a 3 + 1 formulation involving retarded times. The
spacetime algebra simplifies the study of how electromagnetic fields appear to 
different observers, and is particularly powerful for handling accelerated charges 
and radiation. These results build on the applications of spacetime algebra 
described in section 5.5.3. 

The second major advantage of the geometric algebra treatment is a new, 
compact formulation of Maxwell’s equations. The spacetime vector derivative 
and the geometric product enable us to unite all four of Maxwell’s equations 
into a single equation. This is one of the most impressive results in geometric 
algebra. And, as we showed in chapter 6, this is more than merely a cosmetic 
exercise. The vector derivative is invertible directly, without having to pass via 
intermediate, second-order equations. This has many implications for scattering 
and propagator theory. Huygens' principle is encoded directly, and the first-order
theory is preferable for numerical computation of diffraction effects. In addition,
the first-order formulation of electromagnetism means that plane waves are easily
handled, as are their polarisation states.


7.1 Maxwell’s equations


Before writing down the Maxwell equations, we remind ourselves of the notation
introduced in chapter 5. We denote an orthonormal spacetime frame by
$\{\gamma_\mu\}$, with coordinates $x^\mu = \gamma^\mu\cdot x$. The spacetime vector
derivative is


$$\nabla = \gamma^\mu\partial_\mu, \qquad \partial_\mu = \frac{\partial}{\partial x^\mu}. \tag{7.1}$$

The spacetime split of the vector derivative is

$$\nabla\gamma_0 = (\gamma^0\partial_t + \gamma^i\partial_i)\gamma_0 = \partial_t - \sigma_i\partial_i = \partial_t - \boldsymbol{\nabla}, \tag{7.2}$$


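where the $\sigma_i = \gamma_i\gamma_0$ denote a right-handed orthonormal frame for the
relative space defined by the timelike vector $\gamma_0$. The three-dimensional
vector derivative operator is


$$\boldsymbol{\nabla} = \sigma_i\partial_i, \tag{7.3}$$


and all relative vectors are written in bold.
The four Maxwell equations, in SI units, are


$$\begin{aligned}
\boldsymbol{\nabla}\cdot\mathbf{D} &= \rho, & \boldsymbol{\nabla}\cdot\mathbf{B} &= 0, \\
\boldsymbol{\nabla}\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t}, & \boldsymbol{\nabla}\times\mathbf{H} &= \frac{\partial\mathbf{D}}{\partial t} + \mathbf{J},
\end{aligned} \tag{7.4}$$

where

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}, \qquad \mathbf{H} = \frac{1}{\mu_0}\mathbf{B} - \mathbf{M}, \tag{7.5}$$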
Ho 

and the $\boldsymbol{\times}$ symbol denotes the vector cross product. The cross
product is ubiquitous in electromagnetic theory, and it will be encountered at
various points in this chapter. To avoid any confusion, the commutator product
(denoted by $\times$) will not be employed in this chapter.

The first step in simplifying the Maxwell equations is to assume that we are
working in a vacuum region outside isolated sources and currents. We can then
remove the polarisation and magnetisation fields P and M. We also replace the
cross product with the exterior product, and revert to natural units
($c = \epsilon_0 = \mu_0 = 1$), so that the equations now read


$$\begin{aligned}
\boldsymbol{\nabla}\cdot\mathbf{E} &= \rho, & \boldsymbol{\nabla}\cdot\mathbf{B} &= 0, \\
\boldsymbol{\nabla}\wedge\mathbf{E} &= -\partial_t(I\mathbf{B}), & \boldsymbol{\nabla}\wedge\mathbf{B} &= I(\mathbf{J} + \partial_t\mathbf{E}).
\end{aligned} \tag{7.6}$$

We naturally assemble equations for the separate divergence and curl parts of
the vector derivative. We know that there are many advantages in uniting these
into a single equation involving the vector derivative. First we take the two
equations for E and combine them into the single equation


$$\boldsymbol{\nabla}\mathbf{E} = \rho - \partial_t(I\mathbf{B}). \tag{7.7}$$

A similar manipulation combines the B-field equations into

$$\boldsymbol{\nabla}(I\mathbf{B}) = -\mathbf{J} - \partial_t\mathbf{E}, \tag{7.8}$$


where we have multiplied through by $I$. This equation is a combination of
(spatial) vector and pseudoscalar terms, whereas equation (7.7) contains only
scalar and bivector parts. It follows that we can combine all of these equations
into the single multivector equation


$$\boldsymbol{\nabla}(\mathbf{E} + I\mathbf{B}) + \partial_t(\mathbf{E} + I\mathbf{B}) = \rho - \mathbf{J}. \tag{7.9}$$


This is already a significant compactification of the original equations. We have 
not lost any information in writing this, since each of the separate Maxwell 
equations can be recovered by picking out terms of a given grade. 
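
To make the recovery of the separate equations explicit, the following worked
step projects the combined multivector equation onto each grade. It is a sketch
that assumes the three-dimensional identities
$\boldsymbol{\nabla}\cdot(I\mathbf{B}) = I\,\boldsymbol{\nabla}\wedge\mathbf{B}$ and
$\boldsymbol{\nabla}\wedge(I\mathbf{B}) = I\,\boldsymbol{\nabla}\cdot\mathbf{B}$,
which hold because the pseudoscalar commutes with all vectors in three
dimensions.

```latex
\begin{aligned}
\text{scalar:}\quad       & \boldsymbol{\nabla}\cdot\mathbf{E} = \rho, \\
\text{vector:}\quad       & I\,\boldsymbol{\nabla}\wedge\mathbf{B} + \partial_t\mathbf{E} = -\mathbf{J}
                            \;\Longrightarrow\;
                            \boldsymbol{\nabla}\wedge\mathbf{B} = I(\mathbf{J} + \partial_t\mathbf{E}), \\
\text{bivector:}\quad     & \boldsymbol{\nabla}\wedge\mathbf{E} + \partial_t(I\mathbf{B}) = 0, \\
\text{pseudoscalar:}\quad & I\,\boldsymbol{\nabla}\cdot\mathbf{B} = 0
                            \;\Longrightarrow\;
                            \boldsymbol{\nabla}\cdot\mathbf{B} = 0.
\end{aligned}
```

Each line reproduces one of the four vacuum equations in natural units, which is
the sense in which no information is lost in the combined form.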

In section 5.5.3 we introduced the Faraday bivector F. This represents the 
electromagnetic field strength and is defined by 


$$F = \mathbf{E} + I\mathbf{B}. \tag{7.10}$$


The combination of relative vectors and bivectors tells us that this quantity is a
spacetime bivector. Many authors have noticed that the Maxwell equations can
be simplified if expressed in terms of the complex quantity
$\mathbf{E} + i\mathbf{B}$. The reason is that the spacetime pseudoscalar has negative
square, so can be represented by the unit imaginary for certain applications. It
is important, however, to work with $I$ in the full spacetime setting, as $I$
anticommutes with spacetime vectors. In terms of the field strength the Maxwell
equations reduce to


$$\boldsymbol{\nabla}F + \partial_t F = \rho - \mathbf{J}. \tag{7.11}$$


We now wish to convert this to manifestly Lorentz covariant form. We introduce
the spacetime current J, which has


$$\rho = J\cdot\gamma_0, \qquad \mathbf{J} = J\wedge\gamma_0. \tag{7.12}$$

It follows that

$$\rho - \mathbf{J} = \gamma_0\cdot J + \gamma_0\wedge J = \gamma_0 J. \tag{7.13}$$


But we know that $\partial_t + \boldsymbol{\nabla} = \gamma_0\nabla$. We can therefore
pre-multiply equation (7.11) by $\gamma_0$ to assemble the covariant equation


$$\nabla F = J. \tag{7.14}$$


This unites all four Maxwell equations into a single spacetime equation based on
the geometric product with the vector derivative. An immediate consequence is
seen if we multiply through by $\nabla$, giving


$$\nabla^2 F = \nabla J = \nabla\cdot J + \nabla\wedge J. \tag{7.15}$$

Since $\nabla^2$ is a scalar operator, the left-hand side can only contain bivector
terms. It follows that the current J must satisfy the conservation equation

$$\nabla\cdot J = \frac{\partial\rho}{\partial t} + \boldsymbol{\nabla}\cdot\mathbf{J} = 0. \tag{7.16}$$


This equation tells us that the total charge generating the fields must be
conserved.

The equation $\nabla F = J$ separates into a pair of spacetime equations for the
vector and trivector parts,


$$\nabla\cdot F = J, \qquad \nabla\wedge F = 0. \tag{7.17}$$

In tensor language, these correspond to the pair of spacetime equations

$$\partial_\mu F^{\mu\nu} = J^\nu, \qquad \partial_{[\alpha}F_{\mu\nu]} = 0. \tag{7.18}$$


These two tensor equations are as compact a formulation of the Maxwell equa- 
tions as tensor algebra can achieve, and the same is true of differential forms. 
Only geometric algebra enables us to combine the Maxwell equations (7.17) into 
the single equation VF = J. 


7.1.1 The vector potential 
The fact that $\nabla\wedge F = 0$ tells us that we can introduce a vector field A
such that


$$F = \nabla\wedge A. \tag{7.19}$$


The equation $\nabla\wedge F = \nabla\wedge\nabla\wedge A = 0$ then follows
automatically. The field A is known as the vector potential. We shall see in later
chapters that the vector potential is key to the quantum theory of how matter
interacts with radiation. The vector potential is also the basis for the
Lagrangian treatment of electromagnetism, described in chapter 12.

The remaining source equation tells us that the vector potential satisfies


$$\nabla\cdot(\nabla\wedge A) = \nabla^2 A - \nabla(\nabla\cdot A) = J. \tag{7.20}$$


There is some residual freedom in A beyond the restriction of equation (7.19).
We can always add the gradient of a scalar field $\lambda$ to A, since


$$\nabla\wedge(A + \nabla\lambda) = \nabla\wedge A + \nabla\wedge(\nabla\lambda) = F. \tag{7.21}$$


For historical reasons, this ability to alter A is referred to as a gauge freedom.
Before we can solve the equations for A, we must therefore specify a gauge. A
natural way to absorb this freedom is to impose the Lorentz condition


$$\nabla\cdot A = 0. \tag{7.22}$$


This does not totally specify A, as the gradient of a solution of the wave equation
can still be added, but this remaining freedom can be removed by imposing
appropriate boundary conditions. The Lorentz gauge condition implies that
$F = \nabla A$. We then recover a wave equation for the components of A, since


$$\nabla F = \nabla^2 A = J. \tag{7.23}$$


One route to solving the Maxwell equations is to solve the associated wave
equation $\nabla^2 A = J$, with appropriate boundary conditions applied, and then
compute F at the end. In this chapter we explore alternative, more direct routes.

The fact that a gauge freedom exists in the formulation in terms of A suggests 
that some conjugate quantity should be conserved. This is the origin of the 
current conservation law derived in equation (7.16). Conservation of charge is 
therefore intimately related to gauge invariance. A more detailed understanding 
of this will be provided by the Lagrangian framework. 


7.1.2 The electromagnetic field strength 


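In uniting the Maxwell equations we introduced the electromagnetic field strength
$F = \mathbf{E} + I\mathbf{B}$. This is a covariant spacetime bivector. Its components
in the $\{\gamma^\mu\}$ frame give rise to the tensor


$$F^{\mu\nu} = \gamma^\nu\cdot(\gamma^\mu\cdot F) = (\gamma^\nu\wedge\gamma^\mu)\cdot F. \tag{7.24}$$


These are the components of a rank-2 antisymmetric tensor which, written out
as a matrix, has entries


$$F^{\mu\nu} = \begin{pmatrix}
0 & -E_x & -E_y & -E_z \\
E_x & 0 & -B_z & B_y \\
E_y & B_z & 0 & -B_x \\
E_z & -B_y & B_x & 0
\end{pmatrix}. \tag{7.25}$$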

This matrix form of the field strength is often presented in textbooks on relativis- 
tic electrodynamics. It has a number of disadvantages. Amongst these are that 
Lorentz transformations cannot be handled elegantly and the natural complex 
structure is hidden. 

Writing $F = \mathbf{E} + I\mathbf{B}$ decomposes F into the sum of a relative vector
$\mathbf{E}$ and a relative bivector $I\mathbf{B}$. The separate $\mathbf{E}$ and
$I\mathbf{B}$ fields are recovered from


$$\mathbf{E} = \tfrac{1}{2}(F - \gamma_0 F\gamma_0), \qquad I\mathbf{B} = \tfrac{1}{2}(F + \gamma_0 F\gamma_0). \tag{7.26}$$

This shows clearly how the split into $\mathbf{E}$ and $I\mathbf{B}$ fields depends on
the observer velocity ($\gamma_0$ here). Observers in relative motion see different
fields. For example, suppose that a second observer has velocity
$v = R\gamma_0\tilde{R}$ and constructs the rest frame basis vectors


$$\gamma'_\mu = R\gamma_\mu\tilde{R}. \tag{7.27}$$

This observer measures components of an electric field to be

$$E'_i = (\gamma'_i\gamma'_0)\cdot F = (R\sigma_i\tilde{R})\cdot F = \sigma_i\cdot(\tilde{R}FR). \tag{7.28}$$


The effect of a Lorentz transformation can therefore be seen by taking F to
$\tilde{R}FR$. The fact that bivectors are subject to the same rotor transformation
law as vectors is extremely useful for computations.

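Suppose now that two observers measure the F-field at a point. One has
4-velocity $\gamma_0$, and the other is moving at relative velocity $\mathbf{v}$ in the
$\gamma_0$ frame. This observer has 4-velocity


$$v = R\gamma_0\tilde{R}, \qquad R = \exp(\alpha\hat{\mathbf{v}}/2), \tag{7.29}$$


where $\mathbf{v} = \tanh(\alpha)\hat{\mathbf{v}}$. The second observer measures the
$\{\gamma_\mu\}$ components of $\tilde{R}FR$. To find these we decompose F into terms
parallel and perpendicular to $\mathbf{v}$,


$$F = F_\parallel + F_\perp, \tag{7.30}$$

where

$$\mathbf{v}F_\parallel = F_\parallel\mathbf{v}, \qquad \mathbf{v}F_\perp = -F_\perp\mathbf{v}. \tag{7.31}$$


We quickly see that the parallel components are unchanged, but the
perpendicular components transform to


$$\tilde{R}F_\perp R = \exp(-\alpha\hat{\mathbf{v}})F_\perp = \gamma(1 - \mathbf{v})F_\perp, \tag{7.32}$$

where $\gamma$ is the Lorentz factor $(1 - \mathbf{v}^2)^{-1/2}$. This result is
sufficient to immediately establish the transformation law


$$\mathbf{E}'_\perp = \gamma(\mathbf{E} + \mathbf{v}\times\mathbf{B})_\perp, \qquad \mathbf{B}'_\perp = \gamma(\mathbf{B} - \mathbf{v}\times\mathbf{E})_\perp. \tag{7.33}$$


Here the primed vectors are formed from $\mathbf{E}' = E'_i\sigma_i$, for example. These have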
the components of F in the new frame, but combined with the original basis 
vectors. 
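
The transformation law can be turned into a short numerical routine. The sketch
below works in natural units (c = 1), leaves the components parallel to the
relative velocity unchanged, and applies the Lorentz factor to the perpendicular
combinations E + v×B and B − v×E. The function name `boost_fields` and the sample
field values are hypothetical.

```python
import numpy as np

def boost_fields(E, B, v):
    """Fields seen by an observer moving with relative velocity v (c = 1).

    Parallel components are unchanged; perpendicular components pick up a
    Lorentz factor and mix E with B.
    """
    E = np.asarray(E, dtype=float)
    B = np.asarray(B, dtype=float)
    v = np.asarray(v, dtype=float)
    speed_sq = v @ v
    if speed_sq == 0.0:
        return E.copy(), B.copy()
    gamma = 1.0 / np.sqrt(1.0 - speed_sq)
    v_hat = v / np.sqrt(speed_sq)

    E_par = (E @ v_hat) * v_hat
    B_par = (B @ v_hat) * v_hat
    E_perp = E - E_par
    B_perp = B - B_par

    # v x B and v x E are automatically perpendicular to v.
    E_new = E_par + gamma * (E_perp + np.cross(v, B))
    B_new = B_par + gamma * (B_perp - np.cross(v, E))
    return E_new, B_new

# A purely electric field along y, viewed by an observer moving along x.
E_new, B_new = boost_fields([0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.6, 0.0, 0.0])
print(E_new, B_new)
```

The moving observer sees a magnetic field even though the original frame has
none, which is the observer dependence of the split described above.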

Further useful information about the F field is contained in its square, which
defines a pair of Lorentz-invariant terms. We form


$$F^2 = \langle FF\rangle + \langle FF\rangle_4 = \alpha + I\beta, \tag{7.34}$$

which is easily seen to be Lorentz-invariant,

$$(\tilde{R}FR)(\tilde{R}FR) = \tilde{R}FFR = \alpha + I\beta. \tag{7.35}$$


Both the scalar and pseudoscalar terms are independent of the frame in which
they are measured. In the $\gamma_0$ frame these are


$$\alpha = \langle(\mathbf{E} + I\mathbf{B})(\mathbf{E} + I\mathbf{B})\rangle = \mathbf{E}^2 - \mathbf{B}^2 \tag{7.36}$$

and

$$\beta = -\langle I(\mathbf{E} + I\mathbf{B})(\mathbf{E} + I\mathbf{B})\rangle = 2\mathbf{E}\cdot\mathbf{B}. \tag{7.37}$$


The former yields the Lagrangian density for the electromagnetic field. The
latter is seen less often. It is perhaps surprising that $\mathbf{E}\cdot\mathbf{B}$ is
a full Lorentz invariant, rather than just being invariant under rotations.
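
The invariance of both quantities can be checked numerically using the matrix
form of the field strength. This is a sketch under stated assumptions: the
matrix layout is the standard one with first row (0, −E_x, −E_y, −E_z), the boost
is along the x axis, and the helper names `field_tensor`, `split_fields`,
`boost_x` and `invariants` are hypothetical.

```python
import numpy as np

def field_tensor(E, B):
    """Contravariant components F^{mu nu} in the standard matrix layout."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return np.array([[0.0, -Ex, -Ey, -Ez],
                     [Ex, 0.0, -Bz, By],
                     [Ey, Bz, 0.0, -Bx],
                     [Ez, -By, Bx, 0.0]])

def split_fields(F):
    """Read E and B back out of the matrix."""
    E = np.array([F[1, 0], F[2, 0], F[3, 0]])
    B = np.array([F[3, 2], F[1, 3], F[2, 1]])
    return E, B

def boost_x(rapidity):
    """Lorentz boost matrix along x with the given rapidity."""
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = ch
    L[0, 1] = L[1, 0] = -sh
    return L

def invariants(E, B):
    """The pair (E^2 - B^2, 2 E.B)."""
    return E @ E - B @ B, 2.0 * (E @ B)

E = np.array([1.0, -2.0, 0.5])
B = np.array([0.3, 0.7, -1.1])
L = boost_x(0.8)

# A rank-2 contravariant tensor transforms as F' = L F L^T.
F_new = L @ field_tensor(E, B) @ L.T
E_new, B_new = split_fields(F_new)

print("before:", invariants(E, B))
print("after: ", invariants(E_new, B_new))
```

The individual components of E and B change under the boost, but the two
combinations agree to rounding error.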


7.1.3 Dielectric and magnetic media 


The Maxwell equations inside a medium, with polarisation and magnetisation
fields P and M, were given in equation (7.4). These separate into a pair of
spacetime equations. We introduce the spacetime bivector field G by


$$G = \mathbf{D} + I\mathbf{H}. \tag{7.38}$$

Maxwell’s equations are now given by the pair of equations


$$\nabla\wedge F = 0, \qquad \nabla\cdot G = J. \tag{7.39}$$


The first tells us that F has vanishing curl, so can still be obtained from a
vector potential, $F = \nabla\wedge A$. The second equation tells us how the D and H
fields respond to the presence of free sources. These equations on their own are
insufficient to fully describe the behaviour of electromagnetic fields in matter.
They must be augmented by constitutive relations which relate F and G. The
simplest examples of these are for linear, isotropic, homogeneous materials, in
which case the constitutive relations amount to specifying a relative permittivity
$\epsilon_r$ and permeability $\mu_r$. The fields are then related by


$$\mathbf{D} = \epsilon_r\mathbf{E}, \qquad \mathbf{B} = \mu_r\mathbf{H}. \tag{7.40}$$


More complicated models for matter can involve considering responses to
different frequencies, and the presence of preferred directions in the material. The
subject of suitable constitutive relations is one of heuristic model building. We 
are, in effect, seeking models which account for the quantum properties of matter 
in bulk, without facing the full multiparticle quantum equations. 



7.2 Integral and conservation theorems 


A number of important integral theorems exist in electromagnetism. Indeed, 
the subject of integral calculus was largely shaped by considering applications 
to electromagnetism. Here the results are all derived as examples of the funda- 
mental theorem of integral calculus, derived in chapter 6. 


7.2.1 Static fields 


We start by deriving a number of results for static field configurations. When
the fields are static the Maxwell equations reduce to the pair


$$\boldsymbol{\nabla}\mathbf{E} = \frac{\rho}{\epsilon_0}, \qquad \boldsymbol{\nabla}\mathbf{B} = \mu_0 I\mathbf{J}, \tag{7.41}$$

where (for this section) we have reinserted the constants $\epsilon_0$ and $\mu_0$. A
current $\mathbf{J}$ is static if the charge flows at a constant rate. The fact that
$\boldsymbol{\nabla}\wedge\mathbf{E} = 0$ implies that around any closed path


$$\oint_{\partial\sigma}\mathbf{E}\cdot d\mathbf{l} = 0, \tag{7.42}$$


which applies for all static configurations. We can therefore introduce a potential
$\phi$ such that


$$\mathbf{E} = -\boldsymbol{\nabla}\phi. \tag{7.43}$$


The potential $\phi$ is the timelike component of the vector potential A,
$\phi = \gamma_0\cdot A$. One can formulate many of the main results of electrostatics
directly in terms of $\phi$. Here we adopt a different approach and work directly
with the $\mathbf{E}$ and $\mathbf{B}$ fields.

An extremely important integral theorem is a straightforward application of
Gauss’ law (indeed this is Gauss’ original law)


$$\oint_{\partial V}\mathbf{E}\cdot\mathbf{n}\,|dA| = \frac{1}{\epsilon_0}\int_V\rho\,|dX| = \frac{Q}{\epsilon_0}, \tag{7.44}$$


where Q is the enclosed charge. In this formula $\mathbf{n}$ is the outward pointing
normal, formed from $dA = I\mathbf{n}|dA|$, where dA is the directed measure over the
surface, and the scalar measure |dX| is simply


$$|dX| = dx\,dy\,dz. \tag{7.45}$$


For the next application, recall from section 6.4.7 the form of the Green’s function
for the vector derivative in three dimensions,


$$G(\mathbf{r};\mathbf{r}') = \frac{1}{4\pi}\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}. \tag{7.46}$$

An application of the fundamental theorem tells us that

$$\int_V(\dot{G}\dot{\boldsymbol{\nabla}}\mathbf{E} + G\boldsymbol{\nabla}\mathbf{E})\,|dX| = \oint_{\partial V}G\,\mathbf{n}\,|dA|\,\mathbf{E}. \tag{7.47}$$


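If we assume that the sources are localised, so that $\mathbf{E}$ falls off at large
distance, we can take the integral over all space and the right-hand side will
vanish. Replacing G by the Green’s function above we find that the field from a
static charge distribution is given by


$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0}\int\rho(\mathbf{r}')\,\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\,|dX'|. \tag{7.48}$$


If $\rho$ is a single δ-function source, $\rho = Q\delta(\mathbf{r}' - \mathbf{r}_0)$, we
immediately recover the Coulomb field


$$\mathbf{E}(\mathbf{r}) = \frac{Q}{4\pi\epsilon_0}\frac{\mathbf{r} - \mathbf{r}_0}{|\mathbf{r} - \mathbf{r}_0|^3}. \tag{7.49}$$

Unsurprisingly, this is simply a weighted Green’s function.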

For the magnetic field $\mathbf{B}$, the absence of magnetic monopoles is encoded in
the integral equation


$$\oint_{\partial V}\mathbf{B}\cdot\mathbf{n}\,|dA| = 0. \tag{7.50}$$


This tells us that the integral curves of $\mathbf{B}$ always form closed loops. This
is true both inside and outside matter, and holds in the time-dependent case as
well. Next we apply the integral theorem of equation (7.47) with $\mathbf{E}$
replaced by $\mathbf{B}$. If we again assume that the fields are produced by localised
charges and fall off at large distances, we derive


$$I\mathbf{B}(\mathbf{r}) = -\frac{\mu_0}{4\pi}\int\frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\,\mathbf{J}(\mathbf{r}')\,|dX'|. \tag{7.51}$$

The scalar term in the integrand vanishes as a consequence of the static
conservation law $\boldsymbol{\nabla}\cdot\mathbf{J} = 0$. The bivector term gives the
magnetic field bivector $I\mathbf{B}$.

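Now suppose that the current is carried entirely in an ‘ideal’ wire. This is taken
as an infinitely thin wire carrying a current $\mathcal{J}$,


$$\mathbf{J}(\mathbf{r}) = \mathcal{J}\int d\lambda\,\frac{d\mathbf{y}(\lambda)}{d\lambda}\,\delta(\mathbf{r} - \mathbf{y}(\lambda)) = \mathcal{J}\int d\mathbf{l}\,\delta(\mathbf{r} - \mathbf{y}(l)). \tag{7.52}$$


We have little option but to use $\mathcal{J}$ for the current as the more standard
symbol I is already taken for the pseudoscalar. The result is that the B-field is
determined by a line integral along the wire. This is the Biot–Savart law, which
can be written


$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0\mathcal{J}}{4\pi}\int\frac{d\mathbf{l}'\times(\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3}, \tag{7.53}$$


where $\mathbf{r}'$ is the position vector to the line element $d\mathbf{l}'$.

A further integral theorem for magnetic fields is found if we consider the
integral around a loop enclosing a surface $\sigma$. We have


$$\oint_{\partial\sigma}\mathbf{B}\cdot d\mathbf{l} = \int_\sigma(\dot{\mathbf{B}}\wedge\dot{\boldsymbol{\nabla}})\cdot dA = \mu_0\int_\sigma\mathbf{J}\cdot(-I\,dA). \tag{7.54}$$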


Again, we write dA = In|dA|, where n is the unit right-handed normal. That 
is, if we grip the surface in our right hands in the manner specified by the 
line integral, our thumbs point in the normal direction. The result is that we 
integrate $\mathbf{J}\cdot\mathbf{n}$ over the surface. This returns the total current
through the loop, $\mathcal{J}$, recovering Ampère’s law,


$$\oint_{\partial\sigma}\mathbf{B}\cdot d\mathbf{l} = \mu_0\mathcal{J}. \tag{7.55}$$
This is routinely used for finding the magnetic fields surrounding electrical cir- 
cuits. 
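
The Biot–Savart law and Ampère's law can be compared numerically. The sketch
below evaluates the Biot–Savart line integral for a long, straight, finite wire
along the z axis using a midpoint rule, and then integrates the resulting field
around a circle enclosing the wire. The function names `wire_field` and
`circulation`, the wire length and the discretisation are all hypothetical
choices. Because the wire is finite, the circulation is expected to approach the
Ampère value only as the wire length becomes large compared with the loop radius.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability in SI units

def wire_field(r, current=1.0, half_length=200.0, n_segments=200_000):
    """B at point r from a straight wire on the z axis (midpoint Biot-Savart)."""
    dz = 2.0 * half_length / n_segments
    z = -half_length + dz * (np.arange(n_segments) + 0.5)
    source = np.stack([np.zeros_like(z), np.zeros_like(z), z], axis=1)
    sep = r - source                      # r - r' for every wire element
    dl = np.array([0.0, 0.0, dz])         # line element along the wire
    dist_cubed = np.linalg.norm(sep, axis=1)[:, None] ** 3
    integrand = np.cross(dl, sep) / dist_cubed
    return MU0 * current / (4.0 * np.pi) * integrand.sum(axis=0)

def circulation(radius=1.0, n_points=16, **wire_kwargs):
    """Line integral of B around a circle of given radius in the z = 0 plane."""
    total = 0.0
    d_phi = 2.0 * np.pi / n_points
    for k in range(n_points):
        phi = (k + 0.5) * d_phi
        r = radius * np.array([np.cos(phi), np.sin(phi), 0.0])
        tangent = np.array([-np.sin(phi), np.cos(phi), 0.0])
        total += (wire_field(r, **wire_kwargs) @ tangent) * radius * d_phi
    return total

loop_integral = circulation()
print("circulation / mu0:", loop_integral / MU0)
```

For a wire of half-length L and a loop of radius a, the exact finite-wire
circulation is μ₀𝒥 L/(L² + a²)^{1/2}, so the deviation from Ampère's law shrinks
as the wire is lengthened.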


7.2.2 Time-varying fields 


If the fields vary in time, some of the preceding formulae remain valid, and
others only require simple modifications. The two applications of Gauss’ law,
equations (7.44) and (7.50), remain unchanged. The two applications of Stokes’
theorem acquire an additional term. For the E-field we have


$$\oint_{\partial\sigma}\mathbf{E}\cdot d\mathbf{l} = \frac{d}{dt}\int_\sigma(I\mathbf{B})\cdot dA = -\frac{d\Phi}{dt}, \tag{7.56}$$


where $\Phi$ is the linked magnetic flux. The flux is the integral of
$\mathbf{B}\cdot\mathbf{n}$ over the area enclosed by the loop, with $\mathbf{n}$ the unit
normal. Magnetic flux is an important concept for understanding inductance in
circuits.

For the magnetic field we can derive a similar formula,


$$\oint_{\partial\sigma}\mathbf{B}\cdot d\mathbf{l} = \mu_0\mathcal{J} + \epsilon_0\mu_0\frac{d}{dt}\int_\sigma\mathbf{E}\cdot\mathbf{n}\,|dA|. \tag{7.57}$$


This is useful when studying boundary conditions at surfaces of media carrying 
time-varying currents. The equations involving the Euclidean Green’s function 
are no longer valid when the sources vary with time. In section 7.5 we discuss an 
alternative Green’s function suitable for the important case of electromagnetic 
radiation. 


7.2.3 The energy-momentum tensor 


The energy density contained in a vacuum electromagnetic field, measured in
the $\gamma_0$ frame, is

\[
\varepsilon = \tfrac{1}{2}(\boldsymbol{E}^2 + \boldsymbol{B}^2), \tag{7.58}
\]

where we have reverted to natural units. In section 7.1.2 we saw that the quantity
$\boldsymbol{E}^2 - \boldsymbol{B}^2$ is Lorentz-invariant. This is not true of the energy density, which should
clearly depend on the observer performing the measurement. The total energy
in a volume V is found by integrating $\varepsilon$ over the volume. If we look at how this
varies in time, assuming no sources are present, we find that

\[
\frac{d}{dt}\int_V |dX|\, \tfrac{1}{2}(\boldsymbol{E}^2 + \boldsymbol{B}^2)
  = \int_V |dX|\, \langle -\boldsymbol{E}\boldsymbol{\nabla}(I\boldsymbol{B}) + I\boldsymbol{B}\boldsymbol{\nabla}\boldsymbol{E} \rangle
  = \oint_{\partial V} |dA|\, \boldsymbol{n}\cdot(\boldsymbol{E}\cdot(I\boldsymbol{B})). \tag{7.59}
\]


We therefore establish that the field momentum is described by the Poynting 
vector 


\[
\boldsymbol{P} = -\boldsymbol{E}\cdot(I\boldsymbol{B}) = \boldsymbol{E}\times\boldsymbol{B}. \tag{7.60}
\]


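The energy and momentum should be the components of a spacetime 4-vector
P, so we form
\[
\begin{aligned}
P = (\varepsilon + \boldsymbol{P})\gamma_0 &= \tfrac{1}{2}(\boldsymbol{E}^2 + \boldsymbol{B}^2)\gamma_0 + \tfrac{1}{2}(I\boldsymbol{B}\boldsymbol{E} - \boldsymbol{E}I\boldsymbol{B})\gamma_0 \\
  &= \tfrac{1}{2}(\boldsymbol{E} + I\boldsymbol{B})(\boldsymbol{E} - I\boldsymbol{B})\gamma_0 \\
  &= \tfrac{1}{2}F(-\gamma_0 F \gamma_0)\gamma_0 = -\tfrac{1}{2}F\gamma_0 F.
\end{aligned} \tag{7.61}
\]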


This quantity is still observer-dependent as it contains a factor of $\gamma_0$. We have in
fact constructed the energy-momentum tensor of the electromagnetic field. We
write this as

\[
T(a) = -\tfrac{1}{2}FaF = \tfrac{1}{2}Fa\tilde{F}. \tag{7.62}
\]


This is clearly a linear function of a and, since it is equal to its own reverse, the 
result is automatically a vector. It is instructive to contrast our neat form of the 
energy-momentum tensor with the tensor formula 


\[
T^{\mu\nu} = \tfrac{1}{4}\eta^{\mu\nu}F^{\alpha\beta}F_{\alpha\beta} + F^{\mu\alpha}F_{\alpha}{}^{\nu}. \tag{7.63}
\]


The geometric algebra form of equation (7.62) does a far better job of capturing 
the geometric content of the electromagnetic energy-momentum tensor. 
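A short numerical sketch may help to make the observer dependence concrete. It evaluates the energy density (7.58) and the Poynting vector (7.60) for given field components, and checks the standard identity that the combination $\varepsilon^2 - \boldsymbol{P}^2$ depends only on the two Lorentz invariants $\boldsymbol{E}^2 - \boldsymbol{B}^2$ and $\boldsymbol{E}\cdot\boldsymbol{B}$. The code assumes natural units and conventional 3-vector components; the function names and sample values are our own.

```python
def dot(a, b):
    """Euclidean dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Conventional 3D cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def energy_and_momentum(E, B):
    """Energy density (E^2 + B^2)/2 and Poynting vector E x B in natural units."""
    return 0.5 * (dot(E, E) + dot(B, B)), cross(E, B)

def four_momentum_norm(E, B):
    """epsilon^2 - P^2, the squared length of the 4-vector (epsilon + P) gamma_0."""
    eps, P = energy_and_momentum(E, B)
    return eps ** 2 - dot(P, P)

def norm_from_invariants(E, B):
    """The same quantity written in terms of E^2 - B^2 and E.B only."""
    return 0.25 * (dot(E, E) - dot(B, B)) ** 2 + dot(E, B) ** 2

generic = four_momentum_norm((1.0, 2.0, -0.5), (0.3, -1.2, 0.8))
plane_wave = four_momentum_norm((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

For a plane wave, with perpendicular fields of equal magnitude, both invariants vanish and the energy-momentum density is null.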

The energy-momentum tensor T (a) returns the flux of 4-momentum across the 
hypersurface perpendicular to a. This is the relativistic extension of the stress 
tensor, and it is as fundamental to field theory as momentum is to the mechanics 
of point particles. All relativistic fields, classical or quantum, have an associated 
energy-momentum tensor that contains information about the distribution of 
energy in the fields, and acts as a source of gravitation. The electromagnetic 
energy-momentum tensor demonstrates a number of properties that turn out 
to be quite general. The first is that the energy-momentum tensor is (usually) 
symmetric. For example, we have 


\[
a\cdot T(b) = -\tfrac{1}{2}\langle aFbF \rangle = -\tfrac{1}{2}\langle FaFb \rangle = T(a)\cdot b. \tag{7.64}
\]
The reason for qualifying the above statement is that quantum spin gives rise to
an antisymmetric contribution to the (matter) energy-momentum tensor. This
will be discussed in more detail when we look at Dirac theory.

A second property of the electromagnetic energy-momentum tensor is that the
energy density $v\cdot T(v)$ is positive for any timelike vector v. This is clear from
the definition of $\varepsilon$ in equation (7.58). The expression for $\varepsilon$ is appropriate to the $\gamma_0$
frame, but the sign of $\varepsilon$ cannot be altered by transforming to a different frame.
The reason is that

\[
\langle vFvF \rangle = \langle R\gamma_0\tilde{R}\, F\, R\gamma_0\tilde{R}\, F \rangle = \langle \gamma_0 F' \gamma_0 F' \rangle, \tag{7.65}
\]

where $F' = \tilde{R}FR$. Transforming to a different velocity is equivalent to
back-transforming the fields in the $\gamma_0$ frame, so keeps the energy density positive.
Matter which does not satisfy the inequality $v\cdot T(v) > 0$ is said to be ‘exotic’,
and has curious properties when acting as a source of gravitational fields.

The third main property of energy-momentum tensors is that, in the absence 
of external sources, they give rise to a set of conserved vectors. This is because 


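we have
\[
\nabla\cdot T(a) = 0 \quad \forall \text{ constant } a. \tag{7.66}
\]
Equivalently, we can use the symmetry of T(a) to write
\[
\dot{T}(\dot{\nabla})\cdot a = 0, \quad \forall a, \tag{7.67}
\]
which implies that
\[
\dot{T}(\dot{\nabla}) = 0. \tag{7.68}
\]

For the case of electromagnetism, this result is straightforward to prove:
\[
\dot{T}(\dot{\nabla}) = -\tfrac{1}{2}\big(\dot{F}\dot{\nabla}F + F\nabla F\big) = 0, \tag{7.69}
\]

which follows since $\nabla F = \dot{F}\dot{\nabla} = 0$ in the absence of sources.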
Conservation of the energy-momentum tensor implies that the total flux of 
energy-momentum over a closed hypersurface is zero: 


\[
\oint_{\partial V} |dA|\, T(n) = 0, \tag{7.70}
\]

where $\partial V$ is a closed 3-surface with directed measure $dA = nI\,|dA|$. That the flux
vanishes is a simple application of the fundamental theorem of integral calculus
(in flat spacetime),

\[
\oint_{\partial V} T(n\,|dA|) = \oint_{\partial V} T(dA\, I^{-1}) = \int_V \dot{T}(\dot{\nabla})\, dX\, I^{-1} = 0. \tag{7.71}
\]
Given that $T(\gamma_0)$ is the energy-momentum density in the $\gamma_0$ frame, the total
4-momentum is

\[
P_{\mathrm{tot}} = \int |dX|\, T(\gamma_0). \tag{7.72}
\]
Figure 7.1 Hypersurface integration. The integral over a hypersurface of
a (spacetime) conserved current is independent of the chosen hypersurface.
The two surfaces $S_1$ and $S_2$ can be joined at spatial infinity (provided the
fields vanish there). The difference is therefore the integral over a closed
3-surface, which vanishes by the divergence theorem.


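The conservation equation (7.68) guarantees that, in the absence of charges, the
total energy-momentum is conserved. We see that

\[
\frac{d}{dt}P_{\mathrm{tot}} = \int |dX|\, \partial_t T(\gamma_0) = \int |dX|\, \dot{T}(\dot{\boldsymbol{\nabla}}\gamma_0), \tag{7.73}
\]

where we have used the fact that $\nabla = \gamma_0\partial_t - \boldsymbol{\nabla}\gamma_0$. The final integral here is a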
total derivative and so gives rise to a boundary term, which vanishes provided 
the fields fall off sufficiently fast at large distances. Similarly, we can also see 
that $P_{\mathrm{tot}}$ is independent of the chosen timelike axis. It is a covariant (non-local)
property of the field configuration. The proof comes from considering the integral 
over two distinct spacelike hypersurfaces (figure 7.1). If the integrals are joined 
at infinity (which introduces zero contribution) we form a closed integral of T(n). 
This vanishes from the conservation equation, so the total energy-momentum is 
independent of the choice of hypersurface. 

In the presence of additional sources the electromagnetic energy-momentum 
tensor is no longer conserved. The total energy-momentum tensor, including 
both the matter and electromagnetic content, will be conserved, however. This is
a general feature of field theory in a flat spacetime, though the picture is altered 
somewhat if gravitational fields are present. The extent to which the separate 
tensors for each field are not conserved contains useful information about the 
flow of energy-momentum. For example, suppose that an external current is 
present, so that 


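\[
\dot{T}(\dot{\nabla}) = -\tfrac{1}{2}(-JF + FJ) = J\cdot F. \tag{7.74}
\]

An expression of the form $J\cdot F$ was derived in the Lorentz force law, discussed
in section 5.5.3. In the $\gamma_0$ frame, $J\cdot F$ decomposes into

\[
J\cdot F = \langle (\rho + \boldsymbol{J})\gamma_0(\boldsymbol{E} + I\boldsymbol{B}) \rangle_1 = -(\boldsymbol{J}\cdot\boldsymbol{E} + \rho\boldsymbol{E} + \boldsymbol{J}\times\boldsymbol{B})\gamma_0. \tag{7.75}
\]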
Figure 7.2 Field from a moving point charge. The charge follows the
trajectory $x_0(\tau)$, and $X = x - x_0(\tau)$ is the retarded null vector connecting
the point x to the worldline. The time $\tau$ can be viewed as a scalar field
with each value of $\tau$ extended out over the forward null cone. (Label in
figure: surface of constant $\tau$.)


The timelike component, $\boldsymbol{J}\cdot\boldsymbol{E}$, is the work done, that is, the rate of change of energy
density. The relative vector term is the rate of change of field momentum, and 
so is closely related to the force on a point particle. 


7.3 The electromagnetic field of a point charge 


We now derive a formula for the electromagnetic fields generated by a radiating 
charge. This is one of the most important results in classical electromagnetic 
theory. Suppose that a charge q moves along a worldline $x_0(\tau)$, where $\tau$ is
the proper time along the worldline (see figure 7.2). An observer at spacetime
position x receives an electromagnetic influence from the point where the charge’s
worldline intersects the observer’s past light-cone. The vector

\[
X = x - x_0(\tau) \tag{7.76}
\]

is the separation vector down the light-cone, joining the observer to this
intersection point. Since this vector must be null, we can view the equation

\[
X^2 = 0 \tag{7.77}
\]

as defining a map from spacetime position x to a value of the particle’s proper
time $\tau$. That is, for every spacetime position x there is a unique value of the
(retarded) proper time along the charge’s worldline for which the vector connecting
x to the worldline is null. In this sense, we can write $\tau = \tau(x)$, and treat $\tau$ as a
scalar field.

The Liénard-Wiechert potential for the retarded field from a point charge
moving with an arbitrary velocity $v = \dot{x}_0$ is

\[
A = \frac{q}{4\pi}\frac{v}{|X\cdot v|}. \tag{7.78}
\]

This solution is obtained from the wave equation $\nabla^2 A = J$ using the appropriate
retarded Green’s function
\[
G_{\mathrm{ret}}(\boldsymbol{r}, t) = \frac{1}{4\pi|\boldsymbol{r}|}\,\delta(|\boldsymbol{r}| - t). \tag{7.79}
\]
A similar solution exists if the advanced Green’s function is used. The question 
of which is the correct one to use is determined experimentally by the fact that 
no convincing detection of an advanced (acausal) field has ever been reported. 
A deeper understanding of these issues is provided by the quantum treatment of 
radiation. 

If the charge is at rest in the $\gamma_0$ frame, we have

\[
x_0(\tau) = \tau\gamma_0 = (t - r)\gamma_0, \tag{7.80}
\]

where r is the relative 3-space distance from the observer to the charge. The
null vector X is therefore

\[
X = r(\gamma_0 + e_r). \tag{7.81}
\]

For this simple case the 4-potential A is a pure 1/r electrostatic field:

\[
A = \frac{q}{4\pi}\frac{\gamma_0}{|X\cdot\gamma_0|} = \frac{q}{4\pi r}\gamma_0. \tag{7.82}
\]
The same result is obtained if the advanced Green’s function is used. The differ- 
ence between the advanced and retarded solutions is only seen when the charge 
radiates. We know that radiation is not handled satisfactorily in the classical 
theory because it predicts that atoms are not stable and should radiate. Is- 
sues concerning the correct Green’s function cannot be fully resolved without a 
quantum treatment. 


7.3.1 The field strength 
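The aim now is to differentiate the potential of equation (7.78) to find the field
strength. First, we differentiate the equation $X^2 = 0$ to obtain
\[
0 = \gamma^\mu(\partial_\mu X)\cdot X = \gamma^\mu\big(\gamma_\mu - (\partial_\mu\tau)\,\partial_\tau x_0\big)\cdot X = X - \nabla\tau\,(v\cdot X). \tag{7.83}
\]

It follows that

\[
\nabla\tau = \frac{X}{X\cdot v}. \tag{7.84}
\]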
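The gradient of $\tau$ points along X, which is the direction of constant $\tau$. This is
a peculiarity of null surfaces that was first encountered in chapter 6. In finding
an expression for $\nabla\tau$ we have demonstrated how the particle proper time can
be treated as a spacetime scalar field. Fields of this type are known as adjunct
fields: they carry information, but do not exist in any physical sense.

To differentiate A we need an expression for $\nabla(X\cdot v)$. We find that

\[
\begin{aligned}
\nabla(X\cdot v) &= \dot{\nabla}(\dot{X})\cdot v + \nabla\tau\, X\cdot(\partial_\tau v) \\
  &= v - \nabla\tau + \nabla\tau\, X\cdot\dot{v},
\end{aligned} \tag{7.85}
\]
where $\dot{v} = \partial_\tau v$. Provided X is defined in terms of the retarded time, $X\cdot v$ will
always be positive and there is no need for the modulus in the denominator of
equation (7.78). We are now in a position to evaluate $\nabla A$. We find that

\[
\begin{aligned}
\nabla A &= \frac{q}{4\pi}\left(\frac{\nabla v}{X\cdot v} - \frac{1}{(X\cdot v)^2}\nabla(X\cdot v)\,v\right) \\
  &= \frac{q}{4\pi}\left(\frac{X\dot{v}}{(X\cdot v)^2} - \frac{1}{(X\cdot v)^2} - \frac{X(X\cdot\dot{v} - 1)v}{(X\cdot v)^3}\right) \\
  &= \frac{q}{4\pi}\left(\frac{X\wedge v}{(X\cdot v)^3} + \frac{X\dot{v}\,X\cdot v - Xv\,X\cdot\dot{v}}{(X\cdot v)^3}\right).
\end{aligned} \tag{7.86}
\]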

The result is a pure bivector, so $\nabla\cdot A = 0$ and the A field of equation (7.78) is
in the Lorentz gauge. This is to be expected, since the solution is obtained from
the wave equation $\nabla^2 A = J$.

We can gain some insight into the expression for F by writing

\[
X\dot{v}\,X\cdot v - Xv\,X\cdot\dot{v} = -X\big(X\cdot(\dot{v}\wedge v)\big) = \tfrac{1}{2}X\,\dot{v}\wedge v\,X, \tag{7.87}
\]

which uses the fact that $X^2 = 0$. Writing $\Omega_v = \dot{v}\wedge v$ for the acceleration bivector
of the particle, we arrive at the compact formula

\[
F = \frac{q}{4\pi}\frac{X\wedge v + \tfrac{1}{2}X\Omega_v X}{(X\cdot v)^3}. \tag{7.88}
\]

One can proceed to show that, away from the worldline, F satisfies the
free-field equation $\nabla F = 0$. The details are left as an exercise. The solution (7.88)
displays a clean split into a velocity term proportional to 1/(distance)$^2$ and a
long-range radiation term proportional to 1/(distance). The term representing
the distance is simply $X\cdot v$. This is just the distance between the events x and
$x_0(\tau)$ as measured in the rest frame of the charge at its retarded position. The
first term in equation (7.88) is the Coulomb field in the rest frame of the charge.
The second, radiation, term,

\[
F_{\mathrm{rad}} = \frac{q}{4\pi}\frac{\tfrac{1}{2}X\Omega_v X}{(X\cdot v)^3}, \tag{7.89}
\]
is proportional to the rest frame acceleration projected down the null vector X.
The fact that this term falls off as 1/(distance) implies that the energy-momentum
tensor contains a term which falls off as the inverse square of distance. This gives
a non-vanishing surface integral at infinity in equation (7.73) and describes how 
energy is carried away from the source. 


7.3.2 Constant velocity
A charge with constant velocity v has the trajectory
\[
x_0(\tau) = v\tau, \tag{7.90}
\]
where we have chosen an origin so that the particle passes through this point at
$\tau = 0$. The intersection of $x_0(\tau)$ with the past light-cone through x is determined
by
\[
(x - v\tau)^2 = 0 \;\Longrightarrow\; \tau = v\cdot x - \big((v\cdot x)^2 - x^2\big)^{1/2}. \tag{7.91}
\]
We have chosen the earlier root to ensure that the intersection lies on the past
light-cone. We now form $X\cdot v$ to find
\[
X\cdot v = (x - v\tau)\cdot v = \big((v\cdot x)^2 - x^2\big)^{1/2}. \tag{7.92}
\]
We can write this as $|x\wedge v|$ since
\[
|x\wedge v|^2 = x\cdot(v\cdot(x\wedge v)) = (x\cdot v)^2 - x^2. \tag{7.93}
\]
The acceleration bivector vanishes since v is constant, and $X\wedge v = x\wedge v$. It follows
that the Faraday bivector is simply
\[
F = \frac{q}{4\pi}\frac{x\wedge v}{|x\wedge v|^3}. \tag{7.94}
\]

This is the Coulomb field solution with the velocity $\gamma_0$ replaced by v. This
solution could be obtained by transforming the Coulomb field via

\[
F \mapsto F' = RF(\tilde{R}xR)\tilde{R}, \tag{7.95}
\]

where $v = R\gamma_0\tilde{R}$. Covariance of the field equations ensures that this process
generates a new solution.

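We next decompose F into electric and magnetic fields in the $\gamma_0$ frame. This
requires the spacetime split

\[
x\wedge v = \langle x\gamma_0\gamma_0 v \rangle_2 = \gamma\langle (t + \boldsymbol{r})(1 - \boldsymbol{v}) \rangle_2 = \gamma(\boldsymbol{r} - \boldsymbol{v}t) - \gamma\,\boldsymbol{r}\wedge\boldsymbol{v}, \tag{7.96}
\]
where $\boldsymbol{v}$ is the relative velocity and $\gamma$ is the Lorentz factor. We now have
\[
\boldsymbol{E} = \frac{q\gamma}{4\pi d^3}(\boldsymbol{r} - \boldsymbol{v}t), \qquad \boldsymbol{B} = \frac{q\gamma}{4\pi d^3}\, I\,\boldsymbol{r}\wedge\boldsymbol{v}. \tag{7.97}
\]
Here, the effective distance d can be written
\[
d^2 = \gamma^2\big(|\boldsymbol{v}|t - \boldsymbol{v}\cdot\boldsymbol{r}/|\boldsymbol{v}|\big)^2 + \boldsymbol{r}^2 - (\boldsymbol{r}\cdot\boldsymbol{v})^2/\boldsymbol{v}^2. \tag{7.98}
\]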
The electric field points towards the actual position of the charge at time t, and
not its retarded position at time $\tau$. The same is true of the advanced field, hence
the retarded and advanced solutions are equal for charges with constant velocity.
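The equivalence of the covariant distance $|x\wedge v|$ of equation (7.93) and the split form of $d^2$ quoted in equation (7.98) can be checked numerically. The sketch below is an illustration under stated assumptions: natural units, signature (+, -, -, -), relative vectors stored as plain 3-tuples, and function names of our own choosing. It also evaluates the electric field of equation (7.97), which is parallel to $\boldsymbol{r} - \boldsymbol{v}t$.

```python
import math

def dot3(a, b):
    """Euclidean dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def d_squared_covariant(t, r, v):
    """d^2 = (x.v)^2 - x^2 with x = (t, r) and 4-velocity gamma * (1, v)."""
    gamma = 1.0 / math.sqrt(1.0 - dot3(v, v))
    x_dot_v = gamma * (t - dot3(r, v))
    return x_dot_v ** 2 - (t * t - dot3(r, r))

def d_squared_split(t, r, v):
    """d^2 written in terms of the relative vectors r and v."""
    v2 = dot3(v, v)
    speed = math.sqrt(v2)
    gamma2 = 1.0 / (1.0 - v2)
    rv = dot3(r, v)
    return gamma2 * (speed * t - rv / speed) ** 2 + dot3(r, r) - rv ** 2 / v2

def electric_field(q, t, r, v):
    """E = q * gamma / (4 pi d^3) * (r - v t) for a charge passing the origin at t = 0."""
    gamma = 1.0 / math.sqrt(1.0 - dot3(v, v))
    d3 = d_squared_covariant(t, r, v) ** 1.5
    pref = q * gamma / (4.0 * math.pi * d3)
    return tuple(pref * (ri - vi * t) for ri, vi in zip(r, v))

t, r, v = 1.3, (0.4, -0.2, 0.7), (0.3, 0.1, -0.5)
covariant = d_squared_covariant(t, r, v)
split = d_squared_split(t, r, v)
E = electric_field(1.0, t, r, v)
```

Because the field depends only on $\boldsymbol{r} - \boldsymbol{v}t$ and d, no retarded time has to be solved for in this special case.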


7.3.3 Linear acceleration


Suppose that an accelerating charged particle follows the trajectory

\[
x_0(\tau) = a\big(\sinh(g\tau)\gamma_0 + \cosh(g\tau)\gamma_3\big), \tag{7.99}
\]

where $a = g^{-1}$ (see figure 7.3). The velocity is given by

\[
v(\tau) = \cosh(g\tau)\gamma_0 + \sinh(g\tau)\gamma_3 = e^{g\tau\sigma_3}\gamma_0 \tag{7.100}
\]
and the acceleration bivector is simply
\[
\dot{v}v = g\sigma_3. \tag{7.101}
\]


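The charge has constant (relativistic) acceleration in the $\gamma_3$ direction. We again
seek the retarded solution of $X^2 = 0$. This is more conveniently expressed in a
cylindrical polar coordinate system, with

\[
\boldsymbol{r} = \rho\big(\cos(\phi)\,\sigma_1 + \sin(\phi)\,\sigma_2\big) + z\sigma_3, \tag{7.102}
\]

so that $\boldsymbol{r}^2 = \rho^2 + z^2$. We then find the following equivalent expressions for the
retarded proper time:

\[
\begin{aligned}
e^{g\tau} &= \frac{a^2 + r^2 - t^2 - \big((a^2 + r^2 - t^2)^2 - 4a^2(z^2 - t^2)\big)^{1/2}}{2a(z - t)}, \\
e^{-g\tau} &= \frac{a^2 + r^2 - t^2 + \big((a^2 + r^2 - t^2)^2 - 4a^2(z^2 - t^2)\big)^{1/2}}{2a(z + t)}.
\end{aligned} \tag{7.103}
\]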


These equations have a solution provided z +t > 0. As the trajectory assumes 
that the charge has been accelerating for ever, a horizon is formed beyond which 
no effects of the charge are felt (figure 7.3). Constant eternal acceleration of this 
type is unphysical and in practice we only consider the acceleration taking place 
for a short period. 

We can now calculate the radiation from the charge. First we need the effective 
distance 


\[
X\cdot v = \frac{\big((a^2 + r^2 - t^2)^2 - 4a^2(z^2 - t^2)\big)^{1/2}}{2a}. \tag{7.104}
\]

This vanishes on the path of the particle ($\rho = 0$ and $z^2 - t^2 = a^2$), as required.
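The closed forms for the retarded proper time and the effective distance, equations (7.103) and (7.104), are easy to test numerically. The following sketch (an illustration only, with function and variable names of our own) evaluates the $e^{-g\tau}$ form, which stays regular at z = t, and lets one confirm that the resulting separation vector X is null, points along the past light-cone, and reproduces $X\cdot v$.

```python
import math

def retarded_time_and_distance(t, rho, z, a):
    """Return (tau, X.v) for the uniformly accelerated charge, with g = 1/a."""
    if z + t <= 0.0:
        raise ValueError("no retarded solution: the point lies beyond the horizon")
    r2 = rho * rho + z * z
    s = a * a + r2 - t * t
    # guard against tiny negative values produced by rounding on the worldline
    root = math.sqrt(max(s * s - 4.0 * a * a * (z * z - t * t), 0.0))
    tau = -a * math.log((s + root) / (2.0 * a * (z + t)))
    return tau, root / (2.0 * a)

a = 1.0
t, rho, z = 2.0, 1.5, 0.5
tau, xv = retarded_time_and_distance(t, rho, z, a)

# components of X = x - x0(tau) in the gamma_0 frame
T = t - a * math.sinh(tau / a)
Z = z - a * math.cosh(tau / a)
null_residual = T * T - rho * rho - Z * Z
xv_direct = T * math.cosh(tau / a) - Z * math.sinh(tau / a)
```

Here `T > 0` confirms that the retarded (earlier) root has been selected, and `xv_direct` is $X\cdot v$ computed directly from the components of X and v.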
Figure 7.3 Constant acceleration. The spacetime trajectory of a particle
with constant acceleration is a hyperbola. The asymptotes are null vectors
and define future and past horizons. Any signal sent from within the shaded
region S will never be received by the particle.


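The remaining factor in F is

\[
\begin{aligned}
X\wedge v + \tfrac{1}{2}X\dot{v}vX &= x\wedge v - a\sigma_3 + \frac{1}{2a}(x - x_0)\sigma_3(x - x_0) \\
  &= \frac{1}{2a}(z^2 - \rho^2 - t^2 - a^2)\,\sigma_3 + \frac{z\rho}{a}\,\sigma_\rho + \frac{t\rho}{a}\,I\sigma_\phi,
\end{aligned} \tag{7.105}
\]

where $\sigma_\rho$ and $\sigma_\phi$ are the unit spatial axial and azimuthal vectors respectively.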
An instructive way to display the information contained in the expression for F 
is to plot the field lines of E at a fixed time. We assume that the charge starts 
accelerating at $t = t_1$, and stops again at $t = t_2$. There are then discontinuities in
the electric field line directions on the two appropriate light-spheres. In figure 7.4
the acceleration takes place for a short period of time, so that a pulse of radiation
is sent outwards. In figure 7.5 the charge began accelerating from rest at $t = -10a$.
The pattern is well developed, and shows clearly the refocusing of the
field lines onto the ‘image charge’. The image position corresponds to the place 
the charge would have reached had it not started accelerating. Of course, the 
image charge is not actually present, and the field lines diverge after they cross 
the light-sphere corresponding to the start of the acceleration. 

For many applications we are only interested in the fields a long way from the 
source. In this region the fields can usually be approximated by simple dipole 
or higher order multipole fields. Suppose that the charge accelerates for a short 
Figure 7.4 Field lines from an accelerated charge I. The charge accelerated
for $-0.2a < t < 0.2a$, leaving an outgoing pulse of transverse radiation field.
The field lines were computed at t = 5a.


period and emits a pulse of radiation. In the limit $r \gg a$ the pulse will arrive
at some time which, to a good approximation, is centred around the time that
minimises $X\cdot v$. This time is given by

\[
t_0 = \sqrt{r^2 - a^2}. \tag{7.106}
\]

At $t = t_0$ the proper distance $X\cdot v$ evaluates to $\rho$, the distance from the z axis.
The point on the axis $\rho$ away from the observer is where the charge would appear
to be if it were not accelerating. For the large distance approximation to be valid
we therefore also require that $\rho$ is large, so that the proper distance from the
source is large. (For small $\rho$ and $z \gg a$ a different procedure can be used.) We
can now obtain an approximate formula for the radiation field at a fixed location
$\boldsymbol{r}$, with $r, \rho \gg a$, around $t = t_0$. For this we define

\[
\delta t = t - t_0 \tag{7.107}
\]
so that the proper distance is approximated by

\[
X\cdot v \approx \big(\rho^2 + r^2\delta t^2/a^2\big)^{1/2}. \tag{7.108}
\]
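The quality of the approximation (7.108) can be gauged by comparing it with the exact effective distance of equation (7.104). The sketch below is a rough numerical illustration; the sample values of a, $\rho$ and z are arbitrary choices of ours, picked so that $r, \rho \gg a$.

```python
import math

def xv_exact(t, rho, z, a):
    """Exact effective distance X.v for the uniformly accelerated charge."""
    r2 = rho * rho + z * z
    s = a * a + r2 - t * t
    return math.sqrt(s * s - 4.0 * a * a * (z * z - t * t)) / (2.0 * a)

def xv_near_pulse(dt, rho, z, a):
    """Approximate X.v at time t0 + dt, valid for r, rho >> a."""
    r2 = rho * rho + z * z
    return math.sqrt(rho * rho + r2 * dt * dt / (a * a))

a, rho, z = 1.0, 30.0, 40.0           # gives r = 50, so r and rho are both >> a
t0 = math.sqrt(rho * rho + z * z - a * a)

at_centre = xv_exact(t0, rho, z, a)   # should equal rho
exact_late = xv_exact(t0 + 0.2, rho, z, a)
approx_late = xv_near_pulse(0.2, rho, z, a)
relative_error = abs(exact_late - approx_late) / exact_late
```

At the centre of the pulse the exact distance reduces to $\rho$, and a fraction of a time unit away the two expressions still agree to well under one per cent for these values.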
Figure 7.5 Field lines from an accelerated charge II. The charge began its
acceleration at $t_1 = -10a$ and has thereafter accelerated uniformly. The
field lines are plotted at t = 3a.


The remaining terms in F become

\[
X\wedge v + \tfrac{1}{2}X\dot{v}vX \approx \frac{r\rho}{a}(\sigma_\theta + I\sigma_\phi), \tag{7.109}
\]
where $\sigma_\theta$ and $\sigma_\phi$ are unit spherical-polar basis vectors. The final formula is
\[
F \approx \frac{q}{4\pi}\frac{r\rho}{a}\left(\rho^2 + \frac{r^2\delta t^2}{a^2}\right)^{-3/2}(\sigma_\theta + I\sigma_\phi), \tag{7.110}
\]

which describes a pure, outgoing radiation field a large distance from a linearly
accelerating source. The magnitude of the acceleration is controlled by $g = a^{-1}$.


7.3.4 Circular orbits and synchrotron radiation 


As a further application, consider a charge moving in a circular orbit. The 
worldline is defined by 


\[
x_0 = \tau\cosh(\alpha)\,\gamma_0 + a\big(\cos(\omega\tau)\gamma_1 + \sin(\omega\tau)\gamma_2\big), \tag{7.111}
\]
where $a = \omega^{-1}\sinh(\alpha)$. The particle velocity is

\[
v = \cosh(\alpha)\,\gamma_0 + \sinh(\alpha)\big(-\sin(\omega\tau)\gamma_1 + \cos(\omega\tau)\gamma_2\big) = R\gamma_0\tilde{R}, \tag{7.112}
\]
where the rotor R is given by
\[
R = e^{-\omega\tau I\sigma_3/2}\, e^{\alpha\sigma_2/2}. \tag{7.113}
\]

Figure 7.6 Field lines from a rotating charge I. The charge has $\alpha = 0.1$,
which gives rise to a smooth, wavy pattern.


We must first locate the retarded null vector X. The equation $X^2 = 0$ reduces
to

\[
t = \tau\cosh(\alpha) + \big(r^2 + a^2 - 2a\rho\cos(\omega\tau - \phi)\big)^{1/2}, \tag{7.114}
\]

which is an implicit equation for $\tau(x)$. No simple analytic solution exists, but
a numerical solution is easy to achieve. This is aided by the observation that,
for fixed $\boldsymbol{r}$, the mapping between t and $\tau$ is monotonic and $\tau$ is bounded by the
conditions

\[
t - (r^2 + 2a\rho + a^2)^{1/2} < \tau\cosh(\alpha) < t - (r^2 - 2a\rho + a^2)^{1/2}. \tag{7.115}
\]

Once we have a satisfactory procedure for locating $\tau$ on the retarded light-cone,
we can straightforwardly employ the formula for F in numerical simulations. The
first term required is the effective distance $X\cdot v$, which is given by

\[
X\cdot v = \cosh(\alpha)\big(r^2 + a^2 - 2a\rho\cos(\omega\tau - \phi)\big)^{1/2} + \rho\sinh(\alpha)\sin(\omega\tau - \phi). \tag{7.116}
\]
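One possible way to locate the retarded proper time is sketched below. It is a minimal illustration, not the procedure used to produce the figures: it applies simple bisection to the implicit equation (7.114), starting from the bracket supplied by the bounds (7.115), and then evaluates the effective distance (7.116). The function names, the tolerance and the sample point are our own assumptions.

```python
import math

def retarded_tau(t, rho, phi, z, alpha, omega=1.0, tol=1e-12):
    """Solve t = tau*cosh(alpha) + |r - r0(tau)| for tau by bisection."""
    a = math.sinh(alpha) / omega
    ch = math.cosh(alpha)
    r2 = rho * rho + z * z

    def residual(tau):
        dist = math.sqrt(r2 + a * a - 2.0 * a * rho * math.cos(omega * tau - phi))
        return tau * ch + dist - t

    # the bounds on tau*cosh(alpha) bracket the root; the residual is monotonic in tau
    lo = (t - math.sqrt(r2 + 2.0 * a * rho + a * a)) / ch
    hi = (t - math.sqrt(r2 - 2.0 * a * rho + a * a)) / ch
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def effective_distance(tau, rho, phi, z, alpha, omega=1.0):
    """X.v for the circular orbit, given the retarded proper time tau."""
    a = math.sinh(alpha) / omega
    dist = math.sqrt(rho * rho + z * z + a * a
                     - 2.0 * a * rho * math.cos(omega * tau - phi))
    return math.cosh(alpha) * dist + rho * math.sinh(alpha) * math.sin(omega * tau - phi)

t, rho, phi, z, alpha = 5.0, 2.0, 0.3, 1.0, 0.4
tau = retarded_tau(t, rho, phi, z, alpha)
xv = effective_distance(tau, rho, phi, z, alpha)
```

Bisection is chosen here only for robustness; since the map between t and $\tau$ is monotonic, any bracketing root-finder would serve equally well.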


The remaining term to compute, $X\wedge v + X\dot{v}vX/2$, is more complicated, as can
be seen from the behaviour shown in figures 7.6, 7.7 and 7.8. They show the
field lines in the equatorial plane of a rotating charge with $\omega = 1$. For ‘low’
speeds we get the gentle, wavy pattern of field lines shown in figure 7.6. The
case displayed in figure 7.7 is for an intermediate velocity ($\alpha = 0.4$), and displays
many interesting features. By $\alpha = 1$ (figure 7.8) the field lines have concentrated
into synchrotron pulses, a pattern which continues thereafter.

Figure 7.7 Field lines from a rotating charge II. The charge has an
intermediate velocity, with $\alpha = 0.4$. Bunching of the field lines is clearly
visible.

Synchrotron radiation is important in many areas of physics, from particle
physics through to radioastronomy. Synchrotron radiation from a radiogalaxy,
for example, has $a \sim 10^{8}$ m and $r \sim 10^{25}$ m. A power-series expansion in a/r
is therefore quite safe! Typical values of $\cosh(\alpha)$ are $10^4$ for electrons producing
radio emission. In the limit $r \gg a$, the relation between t and $\tau$ simplifies to


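\[
t - r \approx \tau\cosh(\alpha) - a\sin(\theta)\cos(\omega\tau - \phi). \tag{7.117}
\]
The effective distance reduces to
\[
X\cdot v \approx r\cosh(\alpha)\big(1 + \tanh(\alpha)\sin(\theta)\sin(\omega\tau - \phi)\big), \tag{7.118}
\]
and the null vector X is given by the simple expression
\[
X \approx r(\gamma_0 + e_r). \tag{7.119}
\]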


In the expression for F of equation (7.88) we can ignore the $X\wedge v$ (Coulomb)
term, which is negligible compared with the long-range radiation term. For the
radiation term we need the acceleration bivector
\[
\dot{v}v = -\omega\sinh(\alpha)\cosh(\alpha)\big(\cos(\omega\tau)\sigma_1 + \sin(\omega\tau)\sigma_2\big) + \omega\sinh^2(\alpha)\,I\sigma_3. \tag{7.120}
\]

Figure 7.8 Field lines from a rotating charge III. The charge is moving at
a highly relativistic velocity, with $\alpha = 1$. The field lines are concentrated
into a series of synchrotron pulses.


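The radiation term is governed by $X\Omega_v X/2$, which simplifies to

\[
\begin{aligned}
\tfrac{1}{2}X\dot{v}vX \approx{} & \omega r^2\cosh(\alpha)\sinh(\alpha)\cos(\theta)\cos(\omega\tau - \phi)\,\sigma_\theta(1 - \sigma_r) \\
  & + \omega r^2\sinh(\alpha)\big(\cosh(\alpha)\sin(\omega\tau - \phi) + \sinh(\alpha)\sin(\theta)\big)\,\sigma_\phi(1 - \sigma_r).
\end{aligned} \tag{7.121}
\]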


These formulae are sufficient to initiate studying synchrotron radiation. They 
contain a wealth of physical information, but a detailed study is beyond the 
scope of this book. 


7.4 Electromagnetic waves 


For many problems in electromagnetic theory it is standard practice to adopt a 
complex representation of the electromagnetic field, with the implicit assump- 
tion that only the real part represents the physical field. This is particularly 
convenient when discussing electromagnetic waves and diffraction, as studied in 
this and the following section. We have seen, however, that the field strength 
F is equipped with a natural complex structure through the pseudoscalar I.
We should therefore not be surprised to find that, in certain cases, the formal
imaginary i plays the role of the pseudoscalar. This is indeed the case for
circularly-polarised light. But one cannot always identify i with I, as is clear
when handling plane-polarised light. The formal complexification retains its use- 
fulness in such applications and we accordingly adopt it here. It is important 
to remember that this is a formal exercise, and that real parts must be taken 
before forming bilinear objects such as the energy-momentum tensor. The study 
of electromagnetic waves is an old and well-developed subject. Unfortunately, it 
suffers from the lack of a single, universal set of conventions. As far as possible, 
we have followed the conventions of Jackson (1999). 

We seek vacuum solutions to the Maxwell equations which are purely oscilla- 
tory. We therefore start by writing 


\[
F = \mathrm{Re}\big(F_0\, e^{-ik\cdot x}\big). \tag{7.122}
\]
The vacuum equation $\nabla F = 0$ then reduces to the algebraic equation
\[
kF_0 = 0. \tag{7.123}
\]

Pre-multiplying by k we immediately see that $k^2 = 0$, as expected of the wavevector.
The constant bivector $F_0$ must contain a factor of k, as nothing else totally
annihilates k. We therefore must have

\[
F_0 = k\wedge n = kn, \tag{7.124}
\]
where n is some vector satisfying $k\cdot n = 0$. We can always add a further multiple
of k to n, since

\[
k(n + \lambda k) = kn + \lambda k^2 = k\wedge n. \tag{7.125}
\]

This freedom in n can be employed to ensure that n is perpendicular to the 
velocity vector of some chosen observer. 

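As an example, consider a wave travelling in the $\gamma_3$ direction with frequency
$\omega$ as measured in the $\gamma_0$ frame. This implies that $\gamma_0\cdot k = \omega$, so the wavevector is
given by

\[
k = \omega(\gamma_0 + \gamma_3), \tag{7.126}
\]
and the phase term is
\[
-ik\cdot x = -i\omega(t - z). \tag{7.127}
\]

The vector n can be chosen to just contain $\gamma_1$ and $\gamma_2$ components, so we can
write

\[
\begin{aligned}
F &= -(\gamma_0 + \gamma_3)(\alpha_1\gamma_1 + \alpha_2\gamma_2)\cos(k\cdot x) \\
  &= (1 + \sigma_3)(\alpha_1\sigma_1 + \alpha_2\sigma_2)\cos(k\cdot x).
\end{aligned} \tag{7.128}
\]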
This solution represents plane-polarised light, as both the E and B fields lie in
fixed planes, 90° apart, and only their magnitudes oscillate in time. 

An arbitrary phase can be added to the cosine term, so the most general 
solution for a wave travelling in the +z direction is 


\[
F = (1 + \sigma_3)\big((\alpha_1\sigma_1 + \alpha_2\sigma_2)\cos(k\cdot x) + (\beta_1\sigma_1 + \beta_2\sigma_2)\sin(k\cdot x)\big), \tag{7.129}
\]

where the constants $\alpha_i$ and $\beta_i$ are all real. This general solution can describe
all possible states of polarisation. A convenient representation is to introduce
the complex coefficients

\[
c_1 = \alpha_1 + i\beta_1, \qquad c_2 = \alpha_2 + i\beta_2. \tag{7.130}
\]

These form the components of the complex Jones vector $(c_1, c_2)$. In terms of
these components we can write

\[
F = \mathrm{Re}\big((1 + \sigma_3)(c_1\sigma_1 + c_2\sigma_2)\, e^{-ik\cdot x}\big), \tag{7.131}
\]


and it is a straightforward matter to read off the separate E and B fields. 
The multivector $(1 + \sigma_3)$ has a number of interesting properties. It absorbs
factors of $\sigma_3$, as can be seen from

\[
\sigma_3(1 + \sigma_3) = 1 + \sigma_3. \tag{7.132}
\]

In addition, $(1 + \sigma_3)$ squares to give a multiple of itself,

\[
(1 + \sigma_3)^2 = 1 + 2\sigma_3 + \sigma_3^2 = 2(1 + \sigma_3). \tag{7.133}
\]

This property implies that $(1 + \sigma_3)$ does not have an inverse, so in a multivector
expression it acts as a projection operator. The combination $(1 + \sigma_3)/2$ has the
particular property of squaring to give itself back again. Multivectors with this
property are said to be idempotent and are important in the general classification
of Clifford algebras and their spinor representations. In spacetime applications
idempotents invariably originate from a null vector, in the manner that $(1 + \sigma_3)$
originates from a spacetime split of $\gamma_0 + \gamma_3$.
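These algebraic properties can be checked in any faithful representation. The sketch below uses the familiar 2 x 2 Pauli matrix for $\sigma_3$ purely as an illustrative stand-in (this choice of representation, and the helper names, are assumptions of ours and play no role in the text) and verifies equations (7.132) and (7.133), the idempotency of $(1 + \sigma_3)/2$, and the absence of an inverse.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    """Entry-wise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

ONE = [[1, 0], [0, 1]]        # the scalar 1
SIGMA3 = [[1, 0], [0, -1]]    # matrix standing in for sigma_3

P = add(ONE, SIGMA3)          # 1 + sigma_3
absorbs = matmul(SIGMA3, P)   # sigma_3 (1 + sigma_3)
squared = matmul(P, P)        # (1 + sigma_3)^2
half = scale(0.5, P)          # the idempotent (1 + sigma_3)/2
determinant = P[0][0] * P[1][1] - P[0][1] * P[1][0]
```

A vanishing determinant in the matrix picture corresponds to the statement that $(1 + \sigma_3)$ has no inverse.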


7.4.1 Circularly-polarised light 


Many problems are more naturally studied using a basis of circularly-polarised
states, as opposed to plane-polarised ones. These arise when $c_1$ and $c_2$ are $\pi/2$
out of phase. One form is given by $\alpha_1 = -\beta_2 = E_0$ and $\alpha_2 = \beta_1 = 0$, where $E_0$
denotes the magnitude of the electric field. For this solution we can write
\[
\begin{aligned}
F &= E_0(1 + \sigma_3)\big(\sigma_1\cos(k\cdot x) - \sigma_2\sin(k\cdot x)\big) \\
  &= E_0(1 + \sigma_3)\,\sigma_1\, e^{-I\sigma_3\omega(t - z)}.
\end{aligned} \tag{7.134}
\]

In a plane of constant z (a wavefront) the E field rotates in a clockwise (negative) 
sense, when viewed looking back towards the source (figure 7.9). In the optics 
Figure 7.9 Right-circularly-polarised light. In the z = 0 plane the E vector
rotates clockwise, when viewed from above. The wave vector points out of
the page. In space, at constant time, the E field sweeps out a right-handed
helix. (Label in figure: $\boldsymbol{E} = E_0\,\sigma_1\, e^{-I\sigma_3\omega(t - z)}$.)


literature this is known as right-circularly-polarised light. The reason for this is 
that, at constant time, the E field sweeps out a helix in space which defines a 
right-handed screw. If you grip the helix in your right hand, your thumb points 
in the direction in which the helix advances if tracked along in the sense defined 
by your grip. This definition of handedness for a helix is independent of which 
way round you chose to grip it. 

Left-circularly-polarised light has the E field rotating with the opposite sense.
The general form of this solution is

\[
F = (1 + \sigma_3)(\alpha_1\sigma_1 + \alpha_2\sigma_2)\, e^{I\sigma_3 k\cdot x}. \tag{7.135}
\]
Particle physicists prefer an alternative labelling scheme for circularly-polarised 
light. The scheme is based, in part, on the quantum definition of angular mo- 
mentum. In the quantum theory, the total angular momentum consists of a 
spatial part and a spin component. Photons, the quanta of electromagnetic ra- 
diation, have spin-1. The spin vector for these can either point in the direction 
of propagation, or against it, depending on the orientation of rotation of the E 
field. It turns out that for right-circularly-polarised light the spin vector points 
against the direction of propagation, which is referred to as a state of negative 
helicity. Conversely, left-circularly-polarised light has positive helicity. 
Equation (7.132) enables us to convert phase rotations with the bivector $I\sigma_3$
into duality rotations governed by the pseudoscalar I. This relies on the relation

\[
\begin{aligned}
(1 + \sigma_3)\, e^{I\sigma_3\phi} &= (1 + \sigma_3)\big(\cos(\phi) + I\sigma_3\sin(\phi)\big) \\
  &= (1 + \sigma_3)\big(\cos(\phi) + I\sin(\phi)\big) = (1 + \sigma_3)\, e^{I\phi}.
\end{aligned} \tag{7.136}
\]
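Relation (7.136) can also be confirmed numerically. In the sketch below $\sigma_3$ is again represented by a 2 x 2 Pauli matrix, in which representation the pseudoscalar I acts as the imaginary unit times the identity; this representation and the helper names are illustrative assumptions of ours. The check shows that the two exponentials differ in general, but agree once multiplied on the left by $(1 + \sigma_3)$.

```python
import cmath
import math

ONE = [[1, 0], [0, 1]]        # the scalar 1
SIGMA3 = [[1, 0], [0, -1]]    # matrix standing in for sigma_3

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lincomb(a, A, b, B):
    """The matrix a*A + b*B."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]

def max_abs_diff(A, B):
    """Largest entry-wise difference between two matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

def phase_rotation(phi):
    """exp(I sigma_3 phi) = cos(phi) + I sigma_3 sin(phi), with I -> 1j."""
    return lincomb(math.cos(phi), ONE, 1j * math.sin(phi), SIGMA3)

def duality_rotation(phi):
    """exp(I phi), a multiple of the identity in this representation."""
    return lincomb(cmath.exp(1j * phi), ONE, 0, SIGMA3)

P = lincomb(1, ONE, 1, SIGMA3)    # 1 + sigma_3
phi = 0.7
projected_gap = max_abs_diff(matmul(P, phase_rotation(phi)),
                             matmul(P, duality_rotation(phi)))
bare_gap = max_abs_diff(phase_rotation(phi), duality_rotation(phi))
```

The quantity `projected_gap` vanishes to rounding error, while `bare_gap` does not, which is the content of the statement that the projector converts a phase rotation into a duality rotation.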


The general solution for right-circularly-polarised light can now be written 


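\[
\begin{aligned}
F &= (1 + \sigma_3)\, e^{I\sigma_3 k\cdot x}(\alpha_1\sigma_1 + \alpha_2\sigma_2) \\
  &= (1 + \sigma_3)(\alpha_1\sigma_1 + \alpha_2\sigma_2)\, e^{Ik\cdot x}.
\end{aligned} \tag{7.137}
\]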


In this case the complex structure is now entirely geometric, generated by the 
pseudoscalar. This means that there is no longer any need to take the real part 
of the solution, as the bivector is already entirely real. A similar trick can be 
applied to write the constant terms as 


\[
(1 + \sigma_3)(\alpha_1\sigma_1 + \alpha_2\sigma_2) = (1 + \sigma_3)\,\sigma_1(\alpha_1 - I\alpha_2), \tag{7.138}
\]


so that the coefficient also becomes ‘complex’ on the pseudoscalar. The general
form of the right-hand circularly-polarised solution can now be written

\[
F = (1 + \sigma_3)\,\sigma_1\,\alpha_R\, e^{Ik\cdot x}, \tag{7.139}
\]

where $\alpha_R$ is a scalar + pseudoscalar combination. Left-hand circularly-polarised
light is described by reversing the sign of the exponent to $-Ik\cdot x$. General
polarisation states can be built up as linear combinations of these circularly
polarised modes, so we can write

\[
F = (1 + \sigma_3)\,\sigma_1\big(\alpha_R\, e^{Ik\cdot x} + \alpha_L\, e^{-Ik\cdot x}\big). \tag{7.140}
\]

Here both the coefficients $\alpha_L$ and $\alpha_R$ are scalar + pseudoscalar combinations.
The complexification is now based on the pseudoscalar, and we can use $\alpha_R$ and
$\alpha_L$ as alternative, geometrically meaningful, complex coefficients for describing
general polarisation states. For completeness, the $\alpha_L$ and $\alpha_R$ parameters are
related to the earlier plane-polarised coefficients $\alpha_i$ and $\beta_i$ by


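\[
\begin{aligned}
\alpha_R &= \tfrac{1}{2}(\alpha_1 - \beta_2) + \tfrac{1}{2}(\alpha_2 + \beta_1)I, \\
\alpha_L &= \tfrac{1}{2}(\alpha_1 + \beta_2) + \tfrac{1}{2}(\alpha_2 - \beta_1)I.
\end{aligned} \tag{7.141}
\]

The preceding solutions all assume that the wave vector is entirely in the $\sigma_3$
direction. More generally, we can introduce a right-handed coordinate frame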
$\{e_i\}$, with $e_3$ pointing along the direction of propagation. The solutions then all
generalise straightforwardly. In more covariant notation the circularly-polarised
modes can also be written

\[
F = kn\big(\alpha_R\, e^{Ik\cdot x} + \alpha_L\, e^{-Ik\cdot x}\big), \tag{7.142}
\]


where k-n = 0. 
7.4.2 Stokes parameters


A useful way of describing the state of polarisation in light emitted from some 
source is through the Stokes parameters. The general definition of these involves 
time averages of the fields, which we denote here with an overbar. To start with 
we assume that the light is coherent, so that all modes are in the same state. 
We first define the Stokes parameters in terms of the plane-polarised coefficients. 
The electric field is given by 


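\[
\boldsymbol{E} = \mathrm{Re}\big((c_1\sigma_1 + c_2\sigma_2)\, e^{-ik\cdot x}\big) = \mathrm{Re}(\mathcal{E}), \tag{7.143}
\]

where $\mathcal{E}$ denotes the complex amplitude. The first Stokes parameter gives the
magnitude of the electric field,

\[
s_0 = 2\overline{\boldsymbol{E}^2} = \langle \mathcal{E}\mathcal{E}^* \rangle, \tag{7.144}
\]
where the star denotes complex conjugation. This evaluates straightforwardly
to

\[
s_0 = |c_1|^2 + |c_2|^2. \tag{7.145}
\]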

The remaining three Stokes parameters describe the relative amounts of radiation 


present in various polarisation states. If we denote the real components of E by 
Ez; and E, the parameters are defined by 


81 = 2(E2 — E2) = |a|? — |ea/? 
s2 = 4E, E} = 2Re(ci¢3) (7.146) 
s3 = 4E,(t)E,(t + 1/(2w)) = —2Im(c1 c3). 


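The Stokes parameters can equally well be written in terms of the $\alpha_L$ and $\alpha_R$ coefficients of circularly-polarised modes:

$$\begin{aligned}
s_0 &= 2\left(|\alpha_L|^2 + |\alpha_R|^2\right), \\
s_1 &= 4\langle\alpha_L\alpha_R\rangle, \\
s_2 &= -4\langle I\alpha_L\alpha_R\rangle, \\
s_3 &= 2\left(|\alpha_L|^2 - |\alpha_R|^2\right).
\end{aligned} \tag{7.147}$$

For coherent light the Stokes parameters are related by

$$s_0^2 = s_1^2 + s_2^2 + s_3^2. \tag{7.148}$$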
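The two descriptions can be compared numerically. The following is a minimal sketch, not part of the original text: it assumes that the plane-polarised coefficients are split as $c_i = a_i + ib_i$ and that the pseudoscalar $I$ may be modelled by the imaginary unit when only scalar + pseudoscalar combinations are multiplied. The function names are illustrative.

```python
def stokes_from_plane(c1: complex, c2: complex):
    """Stokes parameters from plane-polarised coefficients, eqs (7.145)-(7.146)."""
    cross = c1 * c2.conjugate()
    s0 = abs(c1) ** 2 + abs(c2) ** 2
    s1 = abs(c1) ** 2 - abs(c2) ** 2
    s2 = 2.0 * cross.real
    s3 = -2.0 * cross.imag
    return s0, s1, s2, s3


def stokes_from_circular(aL: complex, aR: complex):
    """Stokes parameters from circular coefficients, eq. (7.147).

    Assumption: the pseudoscalar I is represented by 1j, which is adequate
    here because scalar + pseudoscalar combinations commute and I**2 = -1.
    """
    prod = aL * aR
    s0 = 2.0 * (abs(aL) ** 2 + abs(aR) ** 2)
    s1 = 4.0 * prod.real            # 4 <aL aR>
    s2 = -4.0 * (1j * prod).real    # -4 <I aL aR>
    s3 = 2.0 * (abs(aL) ** 2 - abs(aR) ** 2)
    return s0, s1, s2, s3


# Assumed split c_i = a_i + i b_i, then eq. (7.141) for the circular coefficients.
c1, c2 = 0.6 + 0.2j, -0.3 + 0.7j
a1, b1, a2, b2 = c1.real, c1.imag, c2.real, c2.imag
aR = 0.5 * (a1 - b2) + 0.5j * (a2 + b1)
aL = 0.5 * (a1 + b2) + 0.5j * (a2 - b1)

plane = stokes_from_plane(c1, c2)
circ = stokes_from_circular(aL, aR)
# For coherent light the Stokes four-component object is null, eq. (7.148).
null_defect = plane[0] ** 2 - (plane[1] ** 2 + plane[2] ** 2 + plane[3] ** 2)
```

Under these assumptions both routes return the same four numbers and `null_defect` vanishes to rounding error.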


The $s_\mu$ can therefore be viewed algebraically as the components of a null vector, though its direction in space has no physical significance. This representation for ‘observables’ in terms of a null vector is typical of a two-state quantum system. We can bring this out neatly in the spacetime algebra by introducing the three-dimensional rotor

$$\kappa = \langle\alpha_L\rangle + \langle I\alpha_L\rangle I\sigma_3 - \langle\alpha_R\rangle I\sigma_2 - \langle I\alpha_R\rangle I\sigma_1. \tag{7.149}$$

The (quantum) origin of this object is explained in section 8.1. The rotor $\kappa$ satisfies

$$\kappa\kappa^\dagger = \tfrac{1}{2}s_0, \qquad \kappa\sigma_3\kappa^\dagger = \tfrac{1}{2}s_i\sigma_i. \tag{7.150}$$

It follows that in spacetime

$$2\kappa(\gamma_0 + \gamma_3)\tilde{\kappa} = 2\kappa(1 + \sigma_3)\kappa^\dagger\gamma_0 = s_0\gamma_0 + s_i\gamma_i, \tag{7.151}$$

and since we have rotated a null vector we automatically obtain a null vector.
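The relations (7.150) can be checked numerically. The sketch below is an illustration only. It assumes the reading $\kappa = \langle\alpha_L\rangle + \langle I\alpha_L\rangle I\sigma_3 - \langle\alpha_R\rangle I\sigma_2 - \langle I\alpha_R\rangle I\sigma_1$ of equation (7.149), and it uses the Pauli matrices as a computational stand-in for the $\sigma_k$ (with $I$ represented by the imaginary unit and the reverse by the Hermitian conjugate). All function names are hypothetical.

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Hermitian conjugate, standing in for the reverse in three dimensions."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def kappa_matrix(aL: complex, aR: complex):
    """Matrix stand-in for kappa under the assumed reading of eq. (7.149)."""
    a0 = aL.real    # <alpha_L>
    a3 = -aL.imag   # <I alpha_L>, coefficient of I sigma_3
    a2 = -aR.real   # -<alpha_R>, coefficient of I sigma_2
    a1 = aR.imag    # -<I alpha_R>, coefficient of I sigma_1
    return [[a0 + 1j * a3, a2 + 1j * a1],
            [-a2 + 1j * a1, a0 - 1j * a3]]


def vector_components(M):
    """Components along sigma_k of a matrix representing a pure vector."""
    comps = []
    for S in (S1, S2, S3):
        P = matmul(S, M)
        comps.append(0.5 * (P[0][0] + P[1][1]).real)
    return comps


aL, aR = 0.65 - 0.25j, -0.05 - 0.05j
prod = aL * aR
s0 = 2.0 * (abs(aL) ** 2 + abs(aR) ** 2)
s = [4.0 * prod.real, -4.0 * (1j * prod).real,
     2.0 * (abs(aL) ** 2 - abs(aR) ** 2)]          # eq. (7.147)

K = kappa_matrix(aL, aR)
norm = matmul(K, dagger(K))                        # expected: (s0/2) * identity
rotated = vector_components(matmul(matmul(K, S3), dagger(K)))
```

With these conventions `rotated` reproduces $\tfrac{1}{2}s_i$ and `norm` is $\tfrac{1}{2}s_0$ times the identity, in line with (7.150).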


The unit spatial vector

$$\hat{s} = \frac{s}{s_0}, \qquad s = s_i\sigma_i \tag{7.152}$$

can be represented by a point on a sphere. For light polarisation states this 
is called the Poincaré sphere. For spin-1/2 systems the equivalent construction 
is known as the Bloch sphere. The construction is also useful for describing 
partially coherent light. In this case the light can be viewed as originating from 
a set of discrete (incoherent) sources. The single null vector is replaced by an 
average over the sources, 


s= Sos (7.153) 
k=1 


and the unit vector $\hat{s}$ is replaced by
sy Tw 
0 y k z 
Ss = — —S ; = m 7.154 
= k W y Wk ( ) 


The resulting polarisation vector $s$ has $s^2 \le 1$, so now defines a vector inside the Poincaré sphere. The length of this vector directly encodes the relative amounts of coherent and incoherent light present.
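As an illustration of how incoherent sources move the polarisation vector inside the Poincaré sphere, the following sketch adds the Stokes parameters of independent sources and divides the spatial part by the total $s_0$. This is one simple way of forming the average, assumed here for illustration; the function names are hypothetical.

```python
import math


def stokes(c1: complex, c2: complex):
    """Stokes parameters of a single coherent source, eqs (7.145)-(7.146)."""
    x = c1 * c2.conjugate()
    return (abs(c1) ** 2 + abs(c2) ** 2,
            abs(c1) ** 2 - abs(c2) ** 2,
            2.0 * x.real,
            -2.0 * x.imag)


def polarisation_vector(sources):
    """Sum the Stokes parameters of incoherent sources and normalise by s0."""
    total = [sum(s[i] for s in sources) for i in range(4)]
    return [total[i] / total[0] for i in (1, 2, 3)]


def length(v):
    return math.sqrt(sum(x * x for x in v))


single = length(polarisation_vector([stokes(1 + 0j, 0j)]))
# Equal-power x- and y-polarised sources, added incoherently.
unpolarised = length(polarisation_vector([stokes(1 + 0j, 0j), stokes(0j, 1 + 0j)]))
# A linearly polarised source plus a circularly polarised one.
partial = length(polarisation_vector([stokes(1 + 0j, 0j), stokes(1 + 0j, 1j)]))
```

A single coherent source gives a unit vector, two orthogonal sources of equal power give the centre of the sphere, and the mixed case lies strictly inside.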

The preceding discussion also makes it a simple matter to compute how the Stokes parameters appear to observers moving at different velocities. Suppose that a second observer with velocity $v = e_0$ sets up a frame $\{e_\mu\}$. This is done in such a way that the wave vector still travels in the $e_3$ direction, which requires that

$$e_3 = \frac{k - k\cdot v\,v}{k\cdot v}. \tag{7.155}$$

If the old and new frames are related by a rotor, $e_\mu = R\gamma_\mu\tilde{R}$, then equation (7.155) restricts $R$ to satisfy

$$Rk\tilde{R} = \lambda k. \tag{7.156}$$


Rather than work in the new frame, it is simpler to back-transform the field $F$ and work in the original $\{\gamma_\mu\}$ frame. We define

$$F' = \tilde{R}F(Rx\tilde{R})R = \frac{1}{\lambda}kn'\left(\alpha_R\,e^{Ik\cdot x/\lambda} + \alpha_L\,e^{-Ik\cdot x/\lambda}\right), \tag{7.157}$$

where $n' = \tilde{R}nR$ and $k = \omega(\gamma_0 + \gamma_3)$. We can again choose $n'$ to be perpendicular to $\gamma_0$ by adding an appropriate multiple of $k$. It follows that the only change to the final vector $n$ can be a rotation in the $I\sigma_3$ plane. Performing a spacetime split on $\gamma_0$, and assuming that the original $n$ was $-\gamma_1$, we obtain


$$F' = \frac{1}{\lambda}(1 + \sigma_3)\sigma_1\,e^{-\phi I\sigma_3}\left(\alpha_R\,e^{Ik\cdot x/\lambda} + \alpha_L\,e^{-Ik\cdot x/\lambda}\right), \tag{7.158}$$

where $\phi$ is the angle of rotation in the $I\sigma_3$ plane. The rotation can again be converted to a phase factor on $I$, so the overall change is that $\alpha_R$ and $\alpha_L$ are multiplied by $\lambda^{-1}\exp(I\phi)$. The rescaling has no effect on the unit vector on the Poincaré sphere, so the only change is a rotation through $2\phi$ in the $I\sigma_3$ plane. This implies that the $\sigma_3$ component of the vector on the Poincaré sphere is constant, which is sensible. This component determines the relative amounts of left and right-circularly-polarised light present, and this ratio is independent of which observer measures it. Similar arguments apply to the case of partially coherent light.


7.5 Scattering and diffraction 


We turn now to the related subjects of the scattering and diffraction of electro- 
magnetic waves. This is an enormous subject and our aim here is to provide 
little more than an introduction, highlighting in particular a unified approach 
based on the free-space multivector Green’s function. This provides a first-order 
formulation of the scattering problem, which is valuable in numerical compu- 
tation. We continue to adopt a complex representation for the electromagnetic 
field, and will concentrate on waves of a single frequency. The time dependence 
is then expressed via 


$$F(x) = F(\mathbf{r})e^{-i\omega t}, \tag{7.159}$$

so that the Maxwell equations reduce to

$$\boldsymbol{\nabla}F - i\omega F = 0. \tag{7.160}$$


This is the first-order equivalent of the vector Helmholtz equation. Throughout 
this section we work with the full, complex quantities, and suppress all factors 
of $\exp(-i\omega t)$. All quadratic quantities are assumed to be time averaged.

If sources are present the Maxwell equations become 


$$(\boldsymbol{\nabla} - i\omega)F = \rho - \mathbf{J}. \tag{7.161}$$

Current conservation tells us that the (complex) current satisfies

$$i\omega\rho = \boldsymbol{\nabla}\cdot\mathbf{J}. \tag{7.162}$$

Provided that all the sources are localised in some region in space, there can be no electric monopole term present. This follows because

$$Q = \int |dX|\,\rho = -\frac{i}{\omega}\oint \mathbf{J}\cdot\mathbf{n}\,|dA|, \tag{7.163}$$

where $\mathbf{n}$ is the outward normal. Taking the surface to totally enclose the sources, so that $\mathbf{J}$ vanishes over the surface of integration, we see that $Q = 0$.


7.5.1 First-order Green’s function 


The main result we employ in this section is Green’s theorem in three dimensions 
in the general form 


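$$\int_V \left(\dot{G}\dot{\boldsymbol{\nabla}}F + G\boldsymbol{\nabla}F\right)|dX| = \oint_{\partial V} G\mathbf{n}F\,dA, \tag{7.164}$$

where $\mathbf{n}$ is the outward-pointing normal vector over the surface $\partial V$. If $F$ satisfies the vacuum Maxwell equations, we have

$$\oint_{\partial V} G\mathbf{n}F\,dA = \int_V \left(\dot{G}\dot{\boldsymbol{\nabla}} + i\omega G\right)F\,|dX|. \tag{7.165}$$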


We therefore seek a Green’s function satisfying 

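$$\dot{G}\dot{\boldsymbol{\nabla}} + i\omega G = \delta(\mathbf{r}). \tag{7.166}$$

It will turn out that $G$ only contains (complex) scalar and vector terms, so (by reversing both sides) this equation is equivalent to

$$(\boldsymbol{\nabla} + i\omega)G = \delta(\mathbf{r}). \tag{7.167}$$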


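The Green’s function is easily found from the Green’s function for the (scalar) Helmholtz equation,

$$\phi(\mathbf{r}) = -\frac{1}{4\pi r}e^{i\omega r}. \tag{7.168}$$

This is appropriate for outgoing radiation. Choosing the outgoing Green’s function is equivalent to imposing causality by working with retarded fields. The function $\phi$ satisfies

$$(\boldsymbol{\nabla}^2 + \omega^2)\phi = \delta(\mathbf{r}) = (\boldsymbol{\nabla} + i\omega)(\boldsymbol{\nabla} - i\omega)\phi. \tag{7.169}$$

We therefore see that the required first-order Green’s function is

$$G(\mathbf{r}) = (\boldsymbol{\nabla} - i\omega)\phi = \frac{e^{i\omega r}}{4\pi}\left(\frac{i\omega}{r}(1 - \sigma_r) + \frac{\mathbf{r}}{r^3}\right), \tag{7.170}$$

where $\sigma_r = \mathbf{r}/r$ is the unit vector in the direction of $\mathbf{r}$. This Green’s function is the key to much of scattering theory. With a general argument it satisfies

$$(\boldsymbol{\nabla} + i\omega)G(\mathbf{r} - \mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}') \tag{7.171}$$

or, equivalently,

$$(\boldsymbol{\nabla}' - i\omega)G(\mathbf{r} - \mathbf{r}') = -\delta(\mathbf{r} - \mathbf{r}'), \tag{7.172}$$

where $\boldsymbol{\nabla}'$ denotes the vector derivative with respect to $\mathbf{r}'$.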
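Away from the origin these statements are easy to test numerically. The sketch below is illustrative only: it uses finite differences in the radial variable to check that $\phi$ satisfies the Helmholtz equation for $r > 0$, and that the scalar part and the $\sigma_r$ coefficient of $(\boldsymbol{\nabla} - i\omega)\phi$ agree with the closed form written in (7.170). The step sizes and function names are assumptions made for the example.

```python
import cmath
import math


def phi(r: float, omega: float) -> complex:
    """Outgoing scalar Helmholtz Green's function, eq. (7.168)."""
    return -cmath.exp(1j * omega * r) / (4.0 * math.pi * r)


def helmholtz_residual(r: float, omega: float, h: float = 1e-3) -> complex:
    """(nabla^2 + omega^2) phi for r > 0, using nabla^2 f = (1/r) d^2(r f)/dr^2."""
    u = lambda s: s * phi(s, omega)
    laplacian = (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h) / r
    return laplacian + omega ** 2 * phi(r, omega)


def green_closed_form(r: float, omega: float):
    """(scalar part, coefficient of sigma_r) of G as written in eq. (7.170)."""
    pref = cmath.exp(1j * omega * r) / (4.0 * math.pi)
    return pref * 1j * omega / r, pref * (-1j * omega / r + 1.0 / r ** 2)


def green_numeric(r: float, omega: float, h: float = 1e-5):
    """Same two parts from (nabla - i omega) phi, with nabla phi = sigma_r dphi/dr."""
    dphi = (phi(r + h, omega) - phi(r - h, omega)) / (2.0 * h)
    return -1j * omega * phi(r, omega), dphi


residual = abs(helmholtz_residual(2.0, 3.0))
exact = green_closed_form(2.0, 3.0)
approx = green_numeric(2.0, 3.0)
mismatch = max(abs(e - a) for e, a in zip(exact, approx))
```

Both `residual` and `mismatch` are at the level of the finite-difference error, which is consistent with the delta function being supported only at the origin.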


7.5.2 Radiation and multipole fields 


As a first application, suppose that a localised system of charges in free space, 
with sinusoidal time dependence, generates outgoing radiation fields. We could 
find these by generalising our point source solutions of section 7.3, but here we 
wish to exploit our new Green’s function. We can now immediately write down 
the solution 


$$F(\mathbf{r}) = -\int G(\mathbf{r}' - \mathbf{r})\left(\rho(\mathbf{r}') - \mathbf{J}(\mathbf{r}')\right)|dX'|, \tag{7.173}$$


where the integral is over a volume enclosing all of the sources. Equation (7.172) 
guarantees that this equation solves the Maxwell equations (7.161), subject to 
the boundary condition that only outgoing waves are present at large distances. 
It is worth stressing that the geometric algebra formulation is crucial to the way 
we have a single integral yielding both the electric and magnetic fields. 

Often, one is mainly interested in the radiation fields present at large distances 
from the source. These are the contributions to F which fall off as 1/r. To isolate 
these terms we use the expansion 


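$$e^{i\omega|\mathbf{r} - \mathbf{r}'|} = e^{i\omega r}e^{-i\omega\sigma_r\cdot\mathbf{r}'} + O(r^{-1}), \tag{7.174}$$

so that the Green’s function satisfies

$$\lim_{r\to\infty} G(\mathbf{r}' - \mathbf{r}) = \frac{i\omega}{4\pi r}e^{i\omega r}(1 + \sigma_r)e^{-i\omega\sigma_r\cdot\mathbf{r}'}. \tag{7.175}$$

We therefore find that the limiting form of $F$ can be written

$$F(\mathbf{r}) = -\frac{i\omega}{4\pi r}e^{i\omega r}(1 + \sigma_r)\int e^{-i\omega\sigma_r\cdot\mathbf{r}'}\left(\rho(\mathbf{r}') - \mathbf{J}(\mathbf{r}')\right)|dX'|. \tag{7.176}$$

As expected, the multivector is controlled by the idempotent term $(1 + \sigma_r) = (\gamma_0 + e_r)\gamma_0$, appropriate for outgoing radiation.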

A multipole expansion of the radiation field is achieved by expanding (7.176) 
in a series in wd, where d is the dimension of the source. To leading order, and 
recalling that no monopole term is present, we find that 


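$$\int e^{-i\omega\sigma_r\cdot\mathbf{r}'}\left(\rho(\mathbf{r}') - \mathbf{J}(\mathbf{r}')\right)|dX'| \approx \int \left(-\mathbf{J} - i\omega\rho\,\sigma_r\cdot\mathbf{r}'\right)|dX'| = \int \left(\sigma_r\cdot\mathbf{J} - \mathbf{J}\right)|dX'|, \tag{7.177}$$

Figure 7.10 Scattering by a localised object. The incident field $F_i$ sets up oscillating currents in the object, which generate an outgoing radiation field $F_s$.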


where we have integrated by parts to obtain the final expression. This result is 
more commonly expressed in terms of the electric dipole moment p, via 


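$$\int \mathbf{J}\,|dX| = -\int \mathbf{r}\,\boldsymbol{\nabla}\cdot\mathbf{J}\,|dX| = -i\omega\int \mathbf{r}\rho(\mathbf{r})\,|dX| = -i\omega\mathbf{p}. \tag{7.178}$$

The result is that the $F$ field is given by

$$F(\mathbf{r}) = \frac{\omega^2}{4\pi r}e^{i\omega r}(1 + \sigma_r)(\mathbf{p} - \sigma_r\cdot\mathbf{p}). \tag{7.179}$$

An immediate check is that the scalar term in $F$ vanishes, as it must. The electric and magnetic dipole fields can be read off easily now as

$$\mathbf{E} = \frac{\omega^2}{4\pi r}e^{i\omega r}\sigma_r\,\sigma_r\wedge\mathbf{p}, \qquad I\mathbf{B} = \frac{\omega^2}{4\pi r}e^{i\omega r}\sigma_r\wedge\mathbf{p}. \tag{7.180}$$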


These formulae are quite general for any (classical) radiating object. 
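The structure of the dipole fields can be made concrete with ordinary vector algebra. The sketch below is illustrative and rests on two standard identities in three dimensions: $\sigma_r\,\sigma_r\wedge\mathbf{p} = \mathbf{p} - \sigma_r(\sigma_r\cdot\mathbf{p})$ and $\sigma_r\wedge\mathbf{p} = I\,\sigma_r\times\mathbf{p}$. The common phase factor $e^{i\omega r}$ is omitted, the dipole moment is taken to be real, and the function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dipole_far_field(p, n, omega, r):
    """Amplitude vectors (E, B) of eq. (7.180), with exp(i omega r) dropped.

    p: real dipole moment, n: unit vector sigma_r towards the observer.
    """
    amp = omega ** 2 / (4.0 * math.pi * r)
    n_dot_p = dot(n, p)
    E = [amp * (p[i] - n[i] * n_dot_p) for i in range(3)]   # sigma_r (sigma_r ^ p)
    B = [amp * c for c in cross(n, p)]                       # I B = sigma_r ^ p
    return E, B


theta = math.radians(60.0)
n = [math.sin(theta), 0.0, math.cos(theta)]
p = [0.0, 0.0, 1.0]
omega, r = 2.0, 10.0
E, B = dipole_far_field(p, n, omega, r)
amp = omega ** 2 / (4.0 * math.pi * r)
E_norm = math.sqrt(dot(E, E))
B_norm = math.sqrt(dot(B, B))
flux_direction = cross(E, B)
```

The fields are transverse, equal in magnitude, proportional to $\sin\theta$, and $\mathbf{E}\times\mathbf{B}$ points along $\sigma_r$, as expected for outgoing radiation.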


7.6 Scattering 


The geometry of a basic scattering problem is illustrated in figure 7.10. A 
(known) field F; is incident on a localised object. Usually the incident radia- 
tion is taken to be a plane wave. This radiation sets up oscillating currents in 
the scatterer, which in turn generate a scattered field Fs. The total field F is 
given by 

F = F; + Fs, (7.181) 


and both F; and F, satisfy the vacuum Maxwell equations away from the scat- 
terers. 
The essential difficulty is how to solve for the currents set up by the incident 
radiation. This is extremely complex and a number of distinct approaches are described in the literature. One straightforward result is for scattering from a small uniform dielectric sphere. For this situation we have

$$\mathbf{p} = 4\pi a^3\,\frac{\epsilon_r - 1}{\epsilon_r + 2}\,\mathbf{E}_i, \tag{7.182}$$

where $a$ is the radius of the sphere. From equation (7.180) we see that the ratio of incident to scattered radiation is controlled by $\omega^2$. This ratio determines the differential cross section via

$$\frac{d\sigma}{d\Omega} = r^2\,\frac{|\mathbf{e}^*\cdot\mathbf{E}_s|^2}{|\mathbf{e}^*\cdot\mathbf{E}_i|^2}, \tag{7.183}$$

where the complex vector $\mathbf{e}$ determines the polarisation. The cross section clearly depends on the polarisation of the incident wave. Summing over polarisations the differential cross section is

$$\frac{d\sigma}{d\Omega} = \omega^4a^6\left(\frac{\epsilon_r - 1}{\epsilon_r + 2}\right)^2\frac{1 + \cos^2(\theta)}{2}. \tag{7.184}$$

The factor of $\omega^4 \propto \lambda^{-4}$ is typical of Rayleigh scattering. These results are central to Rayleigh’s explanation of blue skies and red sunsets.
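The strength of the wavelength dependence is easily quantified. The following sketch evaluates the polarisation-summed cross section (7.184) in units with $c = 1$, so that $\omega = 2\pi/\lambda$. The particle radius, the relative permittivity and the two wavelengths are illustrative values chosen for the example, not data from the text.

```python
import math


def rayleigh_cross_section(omega, a, eps_r, theta):
    """Differential cross section of eq. (7.184) for a small dielectric sphere."""
    contrast = (eps_r - 1.0) / (eps_r + 2.0)
    return omega ** 4 * a ** 6 * contrast ** 2 * (1.0 + math.cos(theta) ** 2) / 2.0


def angular_frequency(wavelength):
    """omega = 2 pi / lambda, assuming units with c = 1."""
    return 2.0 * math.pi / wavelength


a, eps_r, theta = 1.0e-9, 1.5, math.pi / 2.0       # illustrative values
blue = rayleigh_cross_section(angular_frequency(450e-9), a, eps_r, theta)
red = rayleigh_cross_section(angular_frequency(700e-9), a, eps_r, theta)
ratio = blue / red                                  # equals (700/450)**4

forward = rayleigh_cross_section(angular_frequency(450e-9), a, eps_r, 0.0)
```

For these wavelengths blue light is scattered nearly six times more strongly than red, and forward scattering is twice as strong as scattering at right angles.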

Suppose now that we know the fields over a closed surface enclosing a volume 
V. Provided that F satisfies the vacuum Maxwell equations throughout V we 
can compute F directly from 


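$$F_s(\mathbf{r}') = \oint_{\partial V} G(\mathbf{r} - \mathbf{r}')\,\mathbf{n}\,F_s(\mathbf{r})\,|dS|. \tag{7.185}$$

We take the volume $V$ to be bounded by two surfaces, $S_1$ and $S_2$, as shown in figure 7.11. The surface $S_1$ is assumed to lie just outside the scatterers, so that $\mathbf{J} = 0$ over $S_1$. The surface $S_2$ is assumed to be spherical, and is taken out to infinity. In this limit only the $1/r$ terms in $G$ and $F$ can contribute to the surface integral over $S_2$. But from equation (7.175) we know that

$$\lim_{r\to\infty} G(\mathbf{r} - \mathbf{r}') = \frac{i\omega}{4\pi r}e^{i\omega r}(1 - \sigma_r)e^{-i\omega\sigma_r\cdot\mathbf{r}'}, \tag{7.186}$$

whereas $F_s$ contains a factor of $(1 + \sigma_r)$. It follows that the integrand $G\mathbf{n}F_s$ contains the term

$$(1 - \sigma_r)\sigma_r(1 + \sigma_r) = 0. \tag{7.187}$$

This is identically zero, so there is no contribution from the surface at infinity. The result is that the scattered field is given by

$$F_s(\mathbf{r}) = \frac{1}{4\pi}\oint e^{i\omega d}\left(\frac{i\omega}{d} + \frac{i\omega(\mathbf{r} - \mathbf{r}')}{d^2} - \frac{\mathbf{r} - \mathbf{r}'}{d^3}\right)\mathbf{n}'F_s(\mathbf{r}')\,|dS(\mathbf{r}')|, \tag{7.188}$$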




Figure 7.11 Surfaces for Green’s theorem. The surface $S_2$ can be taken out to infinity, and $S_1$ lies just outside the scattering surface.


where

$$d = |\mathbf{r} - \mathbf{r}'|. \tag{7.189}$$


Since n is the outward pointing normal to the volume, this points into the scat- 
terers. This result contains all the necessary polarisation and obliquity factors, 
often derived at great length in standard optics texts. 

A significant advantage of this first-order approach is that it clearly embodies Huygens’ principle. The scattered field $F_s$ is propagated into the interior simply by multiplying it by a Green’s function. This accords with Huygens’ original idea of reradiation of wavelets from any given wavefront. Two significant problems remain, however. The first is how to specify $F_s$ over the surface of integration. This requires detailed modelling of the polarisation currents set up by the incident radiation. A subtlety here is that we do not have complete freedom to specify $F$ over the surface. The equation $\boldsymbol{\nabla}F = i\omega F$ implies that the components of $\mathbf{E}$ and $\mathbf{B}$ perpendicular to the boundary surface are determined by the derivatives of the components in the surface. This reduces the number of degrees of freedom in the problem from six to four, as is required for electromagnetism.

A further problem is that, even if $F_s$ has been found, the integrals in equation (7.188) cannot be performed analytically. One can approximate to the large $r$ regime and, after various approximations, recover Fraunhofer and Fresnel optics. Alternatively, equation (7.188) can be used as the basis for numerical simulations of scattered fields. Figure 7.12 shows the type of detailed patterns that can emerge. The plot was calculated using the two-dimensional equivalent of equation (7.188). The total energy density is shown, where the scattering is performed by a series of perfect conductors. A good check that the calculations have been performed correctly is that all the expected shadowing effects are present.

Figure 7.12 Scattering in two dimensions. The plots show the intensity of the electric field, with higher intensity coloured lighter. The incident radiation enters from the bottom right of the diagram and scatters off a conductor with complicated surface features. The conductor is closed in the shadow region. Various diffraction effects are clearly visible. The right-hand plot is a close-up near the surface and shows the complicated pattern of hot and cold regions that can develop.


7.7 Notes 


There is a vast literature on electromagnetism and electrodynamics. For this 
chapter we particularly made use of the classic texts by Jackson (1999) and 
Schwinger et al. (1998), both entitled Classical Electrodynamics. The former 
of these also contains an exhaustive list of further references. Applications of 
geometric algebra to electromagnetism are discussed in the book Multivectors 
and Clifford Algebra in Electrodynamics by Jancewicz (1989). This is largely an 
introductory text and stops short of tackling the more advanced applications. 

We are grateful to Stephen Gull for producing the figures in section 7.3 and for 
stimulating much of the work described in this chapter. Further material can be 
found in the Banff series of lectures by Doran et al. (1996a). Readers interested
in the action at a distance formalism of Wheeler and Feynman can do no better 
than return to their original 1949 paper. It is a good exercise to convert their 
arguments into a more streamlined geometric algebra notation! 
7.8 Exercises


7.1 A circular current loop has radius $a$ and lies in the $z = 0$ plane with its centre at the origin. The loop carries a current $J$. Write down an integral expression for the $\mathbf{B}$ field, and show that on the $z$ axis,

$$\mathbf{B} = \frac{\mu_0Ja^2}{2(a^2 + z^2)^{3/2}}\,\sigma_3.$$

7.2 An extension to the Maxwell equations which is regularly discussed is how they are modified in the presence of magnetic monopoles. If $\rho_m$ and $\mathbf{J}_m$ denote magnetic charges and currents, the relevant equations are

$$\boldsymbol{\nabla}\cdot\mathbf{D} = \rho_e, \qquad \boldsymbol{\nabla}\cdot\mathbf{B} = \rho_m,$$

$$-\boldsymbol{\nabla}\times\mathbf{E} = \frac{\partial}{\partial t}\mathbf{B} + \mathbf{J}_m, \qquad \boldsymbol{\nabla}\times\mathbf{H} = \frac{\partial}{\partial t}\mathbf{D} + \mathbf{J}_e.$$

Prove that in free space these can be written

$$\nabla F = J_e + J_mI,$$

where $J_m = (\rho_m + \mathbf{J}_m)\gamma_0$. A duality transformation of the $\mathbf{E}$ and $\mathbf{B}$ fields is defined by

$$\mathbf{E}' = \mathbf{E}\cos(\alpha) + \mathbf{B}\sin(\alpha), \qquad \mathbf{B}' = \mathbf{B}\cos(\alpha) - \mathbf{E}\sin(\alpha).$$

Prove that this can be written compactly as $F' = Fe^{-I\alpha}$. Hence find the equivalent transformation law for the source terms such that the equations remain invariant, and prove that the electromagnetic energy-momentum tensor is also invariant under a duality transformation.

7.3 A particle follows the trajectory $x_0(\tau)$, with velocity $v = \dot{x}_0$ and acceleration $\dot{v}$. If $X$ is the retarded null vector connecting the point $x$ to the worldline, show that the electromagnetic field at $x$ is given by

$$F = \frac{q}{4\pi}\,\frac{X\wedge v + \tfrac{1}{2}X\Omega_vX}{(X\cdot v)^3},$$

where $\Omega_v = \dot{v}\wedge v$. Prove directly that $F$ satisfies $\nabla F = 0$ off the particle worldline.

7.4 Prove the following formulae relating the retarded $A$ and $F$ fields for a point charge to the null vector $X$:

$$A = -\frac{q}{8\pi\epsilon_0}\nabla^2X, \qquad F = -\frac{q}{8\pi\epsilon_0}\nabla^3X.$$

These expressions are of interest in the ‘action at a distance’ formulation of electrodynamics, as discussed by Wheeler and Feynman (1949).
7.5 Confirm that, at large distances from the source, the radiation fields due to both linearly and circularly accelerating charges go as

$$F_{\mathrm{rad}} \propto \frac{1}{r}(1 + \sigma_r)\mathbf{a},$$

where $\sigma_r\cdot\mathbf{a} = 0$.

7.6 From the solution for the fields due to a point charge in a circular orbit 
(section 7.3.4), explain why synchrotron radiation arrives in pulses. 

7.7 For the $\kappa$ defined in equation (7.149), verify that $\kappa\sigma_3\kappa^\dagger = \tfrac{1}{2}s_i\sigma_i$, where $s_i$ are Stokes parameters.

7.8 A rotor $R$ relates two frames by $e'_\mu = Re_\mu\tilde{R}$. In both frames the vector $e_3$ is defined by

$$e_3 = \frac{k - k\cdot e_0\,e_0}{k\cdot e_0},$$

where $k$ is a fixed null vector. Prove that for this relation to be valid for both frames we must have

$$Rk\tilde{R} = \lambda k.$$

How many degrees of freedom are left in the rotor $R$ if this equation holds?

7.9 In optical problems we are regularly interested in the effects of a planar 
aperture on incident plane waves. Suppose that the aperture lies in the 
z = 0 plane, and we are interested in the fields in the region z > 0. By 
introducing the Green’s function 


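$$G'(\mathbf{r};\mathbf{r}') = G(\mathbf{r} - \mathbf{r}') - G(\mathbf{r} - \bar{\mathbf{r}}'),$$

where $\bar{\mathbf{r}} = -\sigma_3\mathbf{r}\sigma_3$, prove that the field in the region $z > 0$ is given by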
i z'ęlwd 


where $d = |\mathbf{r} - \mathbf{r}'|$. In the Kirchhoff approximation we assume that $F_s$ over the aperture can be taken as the incident plane wave. By working in the large $r$ and small angle limit, prove the Fraunhofer result that the transmitted amplitude is controlled by the Fourier transform of the aperture function.

7.10 Repeat the analysis of the previous question for a two-dimensional arrange- 
ment. You will need to understand some of the properties of Hankel 
functions. 
Quantum theory and spinors


In this chapter we study the application of geometric algebra to both non- 
relativistic and relativistic quantum mechanics. We concentrate on the quan- 
tum theory of spin-1/2 particles, whose dynamics is described by the Pauli and 
Dirac equations. For interactions where spin and relativity are not important 
the dynamics reduces to that of the Schrödinger equation. There are many good 
textbooks describing this topic and we will make no attempt to cover it here. We 
assume, furthermore, that most readers have a basic understanding of quantum 
mechanics, and are familiar with the concepts of states and operators. 

Both the Pauli and Dirac matrices arise naturally as representations of the 
geometric algebras of space and spacetime. It is no surprise, then, that much of 
quantum theory finds a natural expression within geometric algebra. To achieve 
this, however, one must reconsider the standard interpretation of the quantum 
spin operators. Like much discussion of the interpretation of quantum theory, 
certain issues raised here are controversial. There is no question about the va- 
lidity of our algebraic approach, however, and little doubt about its advantages. 
Whether the algebraic simplifications obtained here are indicative of a deeper 
structure embedded in quantum mechanics is an open question. 

In this chapter we only consider the quantum theory of single particles in 
background fields. Multiparticle systems are considered in the following chapter. 
Amongst the results discussed in this section are the angular separation of the 
Dirac equation, and a method of calculating cross sections that avoids the need 
for spin sums. Both of these results are used in chapter 14 for studying the 
behaviour of fermions in gravitational backgrounds. 


8.1 Non-relativistic quantum spin 


The Stern—Gerlach experiment was the first to demonstrate the quantum nature 
of the magnetic moment. In this experiment a beam of particles passes through 
Figure 8.1 The Stern–Gerlach experiment. A particle beam is sent through a highly non-uniform $\mathbf{B}$ field. What emerges is a set of discrete, evenly-spaced beams (labelled spin up and spin down in the figure).


a non-uniform magnetic field B. Classically, one would expect the force on each 
particle to be governed by the equation 


$$\mathbf{f} = \boldsymbol{\mu}\cdot\boldsymbol{\nabla}\mathbf{B}, \tag{8.1}$$


where $\boldsymbol{\mu}$ is the magnetic moment. This would give rise to a continuous distribution after passing through the field. Instead, what is observed is a number of evenly-spaced discrete bands (figure 8.1). The magnetic moment is quantised in the same manner as angular momentum.

When silver atoms are used to make up the beam there is a further surprise: 
only two beams emerge on the far side. Silver atoms contain a single electron 
in their outermost shell, so it looks as if electrons have an intrinsic angular 
momentum which can take only two values. This is known as its spin, though no 
classical picture should be inferred from this name. The double-valued nature 
of the spin suggests that the electron’s wavefunction should contain two terms, 
representing a superposition of the possible spin states, 


$$|\psi\rangle = \alpha|\!\uparrow\rangle + \beta|\!\downarrow\rangle, \tag{8.2}$$

where $\alpha$ and $\beta$ are complex numbers. Such a state can be represented in matrix form as the spinor

$$|\psi\rangle = \begin{pmatrix}\alpha \\ \beta\end{pmatrix}. \tag{8.3}$$

If we align the $z$ axis with the spin-up direction, then the operator returning the spin along the $z$ axis must be

$$\hat{s}_3 = \lambda\begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}, \tag{8.4}$$

where $\lambda$ is to be determined. The spin is added to the orbital angular momentum to give a conserved total angular momentum operator $\hat{j} = \hat{l} + \hat{s}$. For this to
make sense the spin operators should have the same commutation relations as the angular momentum operators $\hat{l}_i$,

$$\hat{l}_i = -i\hbar\epsilon_{ijk}x_j\partial_k, \qquad [\hat{l}_i, \hat{l}_j] = i\hbar\epsilon_{ijk}\hat{l}_k. \tag{8.5}$$

This is sufficient to specify the remaining operators, up to an arbitrary phase (see exercise 8.1). The result is that the spin operators are given by

$$\hat{s}_k = \tfrac{1}{2}\hbar\hat{\sigma}_k, \tag{8.6}$$

where the $\hat{\sigma}_k$ are the familiar Pauli matrices

$$\hat{\sigma}_1 = \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}, \qquad \hat{\sigma}_2 = \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}, \qquad \hat{\sigma}_3 = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}. \tag{8.7}$$

The ‘hat’ notation is used to record the fact that these are viewed explicitly as matrix operators, rather than as elements of a geometric algebra. The Pauli matrices satisfy the commutation relations,

$$[\hat{\sigma}_i, \hat{\sigma}_j] = 2i\epsilon_{ijk}\hat{\sigma}_k. \tag{8.8}$$

They also have the property that two different matrices anticommute,

$$\hat{\sigma}_1\hat{\sigma}_2 + \hat{\sigma}_2\hat{\sigma}_1 = 0, \quad \text{etc.} \tag{8.9}$$

and all of the matrices square to the identity matrix,

$$\hat{\sigma}_1^2 = \hat{\sigma}_2^2 = \hat{\sigma}_3^2 = I. \tag{8.10}$$

These are precisely the relations obeyed by a set of orthonormal vectors in space. We denote such a set by $\{\sigma_k\}$. The crucial distinction is that the Pauli matrices are operators in quantum isospace, whereas the $\{\sigma_k\}$ are vectors in real space.

The $\hat{\sigma}_k$ operators act on two-component complex spinors as described in equation (8.3). Spinors belong to two-dimensional complex vector space, so have four real degrees of freedom. A natural question to ask is whether an equivalent representation can be found in terms of real multivectors, such that the matrix action is replaced by multiplication by the $\{\sigma_k\}$ vectors. To find a natural way to do this we consider the observables of a spinor. These are the eigenvalues of Hermitian operators and, for two-state systems, the relevant operators are the Pauli matrices. We therefore form the three observables

$$n_k = \langle\psi|\hat{\sigma}_k|\psi\rangle. \tag{8.11}$$

The $n_k$ are the components of a single vector in the quantum theory of spin. Focusing attention on the components of this vector, we have

$$\begin{aligned}
n_1 &= \langle\psi|\hat{\sigma}_1|\psi\rangle = \alpha\beta^* + \alpha^*\beta, \\
n_2 &= \langle\psi|\hat{\sigma}_2|\psi\rangle = i(\alpha\beta^* - \alpha^*\beta), \\
n_3 &= \langle\psi|\hat{\sigma}_3|\psi\rangle = \alpha\alpha^* - \beta\beta^*.
\end{aligned} \tag{8.12}$$
The magnitude of the vector with components $n_k$ is

$$|n|^2 = (\alpha\beta^* + \alpha^*\beta)^2 - (\alpha\beta^* - \alpha^*\beta)^2 + (\alpha\alpha^* - \beta\beta^*)^2 = (|\alpha|^2 + |\beta|^2)^2 = \langle\psi|\psi\rangle^2. \tag{8.13}$$

So, provided the state is normalised to 1, the vector $n$ must have unit length. We can therefore introduce polar coordinates and write

$$\begin{aligned}
n_1 &= \sin(\theta)\cos(\phi), \\
n_2 &= \sin(\theta)\sin(\phi), \\
n_3 &= \cos(\theta).
\end{aligned} \tag{8.14}$$

Comparing equation (8.14) with equation (8.12) we see that we must have

$$\alpha = \cos(\theta/2)e^{i\gamma}, \qquad \beta = \sin(\theta/2)e^{i\delta}, \tag{8.15}$$

where $\delta - \gamma = \phi$. It follows that the spinor can be written in terms of the polar coordinates of the vector observable as

$$|\psi\rangle = \begin{pmatrix}\cos(\theta/2)e^{-i\phi/2} \\ \sin(\theta/2)e^{i\phi/2}\end{pmatrix}e^{i(\gamma + \delta)/2}. \tag{8.16}$$


The overall phase factor can be ignored, and what remains is a description in 
terms of half-angles. This suggests a strong analogy with rotors. To investigate 
this analogy, we use the idea that polar coordinates can be viewed as part of an 
instruction to rotate the 3 axis onto the chosen vector. To expose this we write 
the vector n as 


$$n = \sin(\theta)(\cos(\phi)\sigma_1 + \sin(\phi)\sigma_2) + \cos(\theta)\sigma_3. \tag{8.17}$$

This can be written

$$n = R\sigma_3R^\dagger, \tag{8.18}$$

where

$$R = e^{-\phi I\sigma_3/2}e^{-\theta I\sigma_2/2}. \tag{8.19}$$

This suggests that there should be a natural map between the normalised spinor of equation (8.16) and the rotor $R$. Both belong to linear spaces of real dimension four and both are normalised. Expanding out the rotor $R$ the following one-to-one map is found:

$$|\psi\rangle = \begin{pmatrix}a^0 + ia^3 \\ -a^2 + ia^1\end{pmatrix} \leftrightarrow \psi = a^0 + a^kI\sigma_k. \tag{8.20}$$

This map will enable us to perform all operations involving spinors without leaving the geometric algebra of space. Throughout this chapter we use the $\leftrightarrow$ symbol to denote a one-to-one map between conventional quantum mechanics and the multivector equivalent. We will continue to refer to the multivector $\psi$ as a spinor. On this scheme the spin-up and spin-down basis states $|\!\uparrow\rangle$ and $|\!\downarrow\rangle$ become

$$|\!\uparrow\rangle \leftrightarrow 1, \qquad |\!\downarrow\rangle \leftrightarrow -I\sigma_2. \tag{8.21}$$

One can immediately see for these that the vectors of observables have components $(0, 0, \pm 1)$, as required.
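The chain from polar angles to spinor components to multivector coefficients can be traced numerically. The sketch below is an illustration under the conventions of equations (8.12), (8.16) and (8.20), with the overall phase in (8.16) set to zero; the function names are hypothetical.

```python
import cmath
import math


def spinor_from_angles(theta: float, phi: float):
    """(alpha, beta) of eq. (8.16) with the overall phase factor dropped."""
    return (math.cos(theta / 2.0) * cmath.exp(-0.5j * phi),
            math.sin(theta / 2.0) * cmath.exp(0.5j * phi))


def observables(alpha: complex, beta: complex):
    """The components n_k of eq. (8.12)."""
    x = alpha * beta.conjugate()
    return (2.0 * x.real,                       # alpha beta* + alpha* beta
            -2.0 * x.imag,                      # i (alpha beta* - alpha* beta)
            abs(alpha) ** 2 - abs(beta) ** 2)


def multivector_coeffs(alpha: complex, beta: complex):
    """(a0, a1, a2, a3) of psi = a0 + a^k I sigma_k under the map (8.20)."""
    return (alpha.real, beta.imag, -beta.real, alpha.imag)


theta, phi = 1.1, 0.7
alpha, beta = spinor_from_angles(theta, phi)
n = observables(alpha, beta)
expected = (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

up = multivector_coeffs(1 + 0j, 0j)      # spin up
down = multivector_coeffs(0j, 1 + 0j)    # spin down
```

The observables reproduce the polar form (8.14), and the basis states come out as the coefficient lists of $1$ and $-I\sigma_2$, in agreement with (8.21).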


8.1.1 Pauli operators 


Now that a suitable one-to-one map has been found, we need to find a representation for Pauli operators acting on the multivector version of a spinor. It turns out that the action of the quantum $\hat{\sigma}_k$ operators on a state $|\psi\rangle$ is equivalent to the following operation on $\psi$:

$$\hat{\sigma}_k|\psi\rangle \leftrightarrow \sigma_k\psi\sigma_3 \qquad (k = 1, 2, 3). \tag{8.22}$$

The $\sigma_3$ on the right-hand side ensures that the multivector remains in the even subalgebra. The choice of vector does not break rotational covariance, in the same way that choosing the $\hat{\sigma}_3$ matrix to be diagonal does not alter the rotational covariance of the Pauli theory. One can explicitly verify that the translation procedure of equation (8.20) and equation (8.22) is consistent by routine computation; for example

$$\hat{\sigma}_1|\psi\rangle = \begin{pmatrix}-a^2 + ia^1 \\ a^0 + ia^3\end{pmatrix} \leftrightarrow -a^2 + a^1I\sigma_3 - a^0I\sigma_2 + a^3I\sigma_1 = \sigma_1\psi\sigma_3. \tag{8.23}$$

The remaining cases, for $\hat{\sigma}_2$ and $\hat{\sigma}_3$, can be checked equally easily.

Now that we have a translation for the action of the Pauli matrices, we can 
find the equivalent of multiplying by the unit imaginary i. To find this we note 
that 


$$\hat{\sigma}_1\hat{\sigma}_2\hat{\sigma}_3 = \begin{pmatrix}i & 0 \\ 0 & i\end{pmatrix}, \tag{8.24}$$

so multiplication of both components of $|\psi\rangle$ by $i$ can be achieved by multiplying by the product of the three matrix operators. We therefore arrive at the translation

$$i|\psi\rangle \leftrightarrow \sigma_1\sigma_2\sigma_3\psi(\sigma_3)^3 = \psi I\sigma_3. \tag{8.25}$$

So, on this scheme, the unit imaginary of quantum theory is replaced by right multiplication by the bivector $I\sigma_3$. This is certainly suggestive, though it should be borne in mind that this conclusion is a feature of our chosen representation. The appearance of the bivector $I\sigma_3$ is to be expected, since the vector of observables $s = s_k\sigma_k$ was formed by rotating the $\sigma_3$ vector. This vector is unchanged by rotations in the $I\sigma_3$ plane, which provides a geometric picture of phase invariance.
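The translations (8.22) and (8.25) can be confirmed for all three operators at once. The sketch below is an illustration that uses the Pauli matrices as a computational model of the algebra of space, so that $\psi = a^0 + a^kI\sigma_k$ becomes a $2\times2$ matrix whose first column is the spinor of equation (8.20). This modelling choice and the helper names are assumptions made for the example, not part of the text.

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(A, v):
    """Action of a 2x2 matrix on a two-component column."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


def psi_matrix(a0, a1, a2, a3):
    """Matrix model of psi = a0 + a^k I sigma_k, with I represented by 1j."""
    return [[a0 + 1j * a3, a2 + 1j * a1],
            [-a2 + 1j * a1, a0 - 1j * a3]]


def column(M):
    """First column of the matrix: the spinor of eq. (8.20)."""
    return [M[0][0], M[1][0]]


psi = psi_matrix(0.3, -1.2, 0.5, 2.0)
ket = column(psi)

operator_errors = []
for S in (S1, S2, S3):
    lhs = apply(S, ket)                            # sigma_hat_k |psi>
    rhs = column(matmul(matmul(S, psi), S3))       # sigma_k psi sigma_3
    operator_errors.append(max(abs(x - y) for x, y in zip(lhs, rhs)))

I_S3 = [[1j * x for x in row] for row in S3]       # the bivector I sigma_3
phase_error = max(abs(1j * x - y)
                  for x, y in zip(ket, column(matmul(psi, I_S3))))
```

In this model the right-hand $\sigma_3$ leaves the first column unchanged, which is why $\sigma_k\psi\sigma_3$ stays within the even elements while still reproducing the matrix action on the spinor.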
8.1.2 Observables in the Pauli theory


We next need to establish the quantum inner product for our multivector form of a spinor. We first note that the Hermitian adjoint operation has $\hat{\sigma}_k^\dagger = \hat{\sigma}_k$, and reverses the order of all products. This is precisely the same as the reversion operation for multivectors in three dimensions, so the dagger symbol can be used consistently for both operations. The quantum inner product is

$$\langle\psi|\phi\rangle = (\psi_1^*, \psi_2^*)\begin{pmatrix}\phi_1 \\ \phi_2\end{pmatrix} = \psi_1^*\phi_1 + \psi_2^*\phi_2, \tag{8.26}$$

where we ignore spatial integrals. For a wide range of problems the spatial and spin components of the wave function can be separated. If this is not the case then the quantum inner product should also contain an integral over all space. The result of the real part of the inner product is reproduced by

$$\mathrm{Re}\langle\psi|\phi\rangle \leftrightarrow \langle\psi^\dagger\phi\rangle, \tag{8.27}$$

so that, for example,

$$\langle\psi|\psi\rangle \leftrightarrow \langle\psi^\dagger\psi\rangle = \langle(a^0 - a^jI\sigma_j)(a^0 + a^kI\sigma_k)\rangle = \sum_{\alpha=0}^{3}a^\alpha a^\alpha. \tag{8.28}$$

Since

$$\langle\psi|\phi\rangle = \mathrm{Re}\langle\psi|\phi\rangle - i\,\mathrm{Re}\langle\psi|i\phi\rangle, \tag{8.29}$$

the full inner product can be written

$$\langle\psi|\phi\rangle \leftrightarrow \langle\psi^\dagger\phi\rangle - \langle\psi^\dagger\phi I\sigma_3\rangle I\sigma_3. \tag{8.30}$$

The right-hand side projects out the $1$ and $I\sigma_3$ components from the geometric product $\psi^\dagger\phi$. The result of this projection on a multivector $A$ is written $\langle A\rangle_q$.
For even-grade multivectors in three dimensions this projection has the simple 
form 


If the result of an inner product is used to multiply a second multivector, one
has to remember to keep the terms in $I\sigma_3$ to the right of the multivector. This
might appear a slightly clumsy procedure at first, but it is easy to establish con- 
ventions so that manipulations are just as efficient as in the standard treatment. 
Furthermore, the fact that all manipulations are now performed within the geo- 
metric algebra framework offers a number of new ways to simplify the analysis 
of a range of problems. 
8.1.3 The spin vector


As a check on the consistency of our scheme, we return to the expectation value of the spin in the $k$-direction, $\langle\psi|\hat{s}_k|\psi\rangle$. For this we require

$$\langle\psi|\hat{\sigma}_k|\psi\rangle \leftrightarrow \langle\psi^\dagger\sigma_k\psi\sigma_3\rangle - \langle\psi^\dagger\sigma_k\psi I\rangle I\sigma_3. \tag{8.32}$$

Since $\psi^\dagger I\sigma_k\psi$ reverses to give minus itself it has zero scalar part, so the final term on the right-hand side vanishes. This is to be expected, as the $\hat{\sigma}_k$ are Hermitian operators. For the remaining term we note that in three dimensions $\psi\sigma_3\psi^\dagger$ is both odd-grade and reverses to itself, so is a pure vector. We therefore define the spin vector

$$s = \tfrac{1}{2}\hbar\psi\sigma_3\psi^\dagger. \tag{8.33}$$

The quantum expectation now reduces to

$$\langle\psi|\hat{s}_k|\psi\rangle = \tfrac{1}{2}\hbar\langle\sigma_k\psi\sigma_3\psi^\dagger\rangle = \sigma_k\cdot s. \tag{8.34}$$


This new expression has a rather different interpretation to that usually en- 
countered in quantum theory. Rather than forming the expectation value of a 
quantum operator, we are simply projecting out the kth component of the vec- 
tor s. Working with the vector s may appear to raise questions about whether 
we are free to talk about all three components of the spin vector. This is in fact 
consistent with the results of spin measurements, if we view the spin measure- 
ment apparatus as acting more as a spin polariser. This is discussed in Doran 
et al. (1996b). 

The rotor description introduced at the start of this section is recovered by 
first defining the scalar 


$$\rho = \psi\psi^\dagger. \tag{8.35}$$

The spinor $\psi$ then decomposes into

$$\psi = \rho^{1/2}R, \tag{8.36}$$

where $R = \rho^{-1/2}\psi$. The multivector $R$ satisfies $RR^\dagger = 1$, so is a rotor. In this approach, Pauli spinors are nothing but unnormalised rotors. The spin vector $s$ can now be written as

$$s = \tfrac{1}{2}\hbar\rho R\sigma_3R^\dagger, \tag{8.37}$$

which recovers the form of equation (8.18).

The double-sided construction of the expectation value of equation (8.32)
contains an instruction to rotate the fixed $\sigma_3$ axis into the spin direction and
dilate it. It might appear here that we are singling out some preferred direction
in space. But in fact all we are doing is utilising an idea from rigid-body
dynamics, as discussed in section 3.4.3. The $\sigma_3$ on the right of $\psi$ represents a
vector in a ‘reference’ frame. All physical vectors, like s, are obtained by
rotating this frame onto the physical values (see figure 8.2). There is nothing
special about $\sigma_3$: one can choose any (constant) reference frame and use the
appropriate rotation onto s, in the same way that there is nothing special about
the orientation of the reference configuration of a rigid body. In rigid-body
mechanics this freedom is usually employed to align the reference configuration
with the initial state of the body. In quantum theory the convention is to work
with the z axis as the reference vector.

Figure 8.2 The spin vector. The normalised spinor $\psi$ transforms the initial
reference frame onto the frame $\{e_k\}$. The vector $e_3$ is the spin vector. A phase
transformation of $\psi$ generates a rotation in the $e_1e_2$ plane. Such a
transformation is unobservable, so the $e_1$ and $e_2$ vectors are also unobservable.


8.1.4 Rotating spinors 


Suppose that the vector s is to be rotated to a new vector $R_0sR_0^\dagger$. To achieve
this the spinor $\psi$ must transform according to

$$ \psi \mapsto R_0\psi. \tag{8.38} $$

Now suppose that for $R_0$ we use the rotor $R_\theta$,

$$ R_\theta = \exp(-\hat{B}\theta/2), \tag{8.39} $$

where $\hat{B}^2 = -1$ is a constant bivector. The resulting spinor is

$$ \psi' = R_\theta\psi = e^{-\hat{B}\theta/2}\psi. \tag{8.40} $$

We now start to increase $\theta$ from 0 through to $2\pi$, so that $\theta = 2\pi$ corresponds to
a $2\pi$ rotation, bringing all observables back to their original values. But under
this we see that $\psi$ transforms to

$$ \psi' = e^{-\hat{B}\pi}\psi = (\cos(\pi) - \hat{B}\sin(\pi))\psi = -\psi. \tag{8.41} $$

The spinor changes sign! If a spin vector is rotated through $2\pi$, the wavefunction
does not come back to itself, but instead transforms to minus its original value.
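The sign change of equation (8.41) is easy to reproduce. The sketch below assumes the matrix $i\hat{\sigma}_3$ as a stand-in for the unit bivector $\hat{B} = I\sigma_3$ (any matrix squaring to minus the identity would serve), and the names `rotor` and `spin_vector` are illustrative. It applies the rotor for $\theta = 2\pi$ and $\theta = 4\pi$ and compares both the spinor and the spin vector with their initial values.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Assumed matrix stand-in for the unit bivector I sigma_3; it squares to -1.
B_HAT = 1j * SIGMA[2]


def rotor(theta):
    """exp(-B theta/2) = cos(theta/2) - B sin(theta/2), valid because B^2 = -1."""
    return np.cos(theta / 2) * I2 - np.sin(theta / 2) * B_HAT


def spin_vector(psi):
    """Spin vector components in units where hbar = 1."""
    return np.array([0.5 * np.vdot(psi, s @ psi).real for s in SIGMA])


psi = np.array([0.6, 0.8j], dtype=complex)
psi_2pi = rotor(2 * np.pi) @ psi   # single-sided action on the spinor
psi_4pi = rotor(4 * np.pi) @ psi

print("psi after 2 pi:", psi_2pi)  # equals -psi up to rounding
print("psi after 4 pi:", psi_4pi)  # equals +psi up to rounding
print("spin vector before/after 2 pi:", spin_vector(psi), spin_vector(psi_2pi))
```

The observable s is insensitive to the overall sign because it is formed double-sidedly from the spinor.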




This change of sign of a state vector under $2\pi$ rotations is the distinguishing
property of spin-1/2 fermions in quantum theory. Once one sees the rotor
derivation of this result, however, it is rather less mysterious. Indeed, there are
classical phenomena involving systems of linked rotations that show precisely
the same property. One example is the $4\pi$ symmetry observed when rotating an
arm holding a tray. For a more detailed discussion of this point, see chapter 41
of Gravitation by Misner, Thorne & Wheeler (1973). A linear space which is
acted on in a single-sided manner by rotors forms a carrier space for a spin
representation of the rotation group. Elements of such a space are generally called
spinors, which is why that name is adopted for our representation in terms of
even multivectors.


8.1.5 Quantum particles in a magnetic field 


Particles with non-zero spin also have a magnetic moment which is proportional
to the spin. This is expressed as the operator relation

$$ \hat{\mu}_k = \gamma\hat{s}_k, \tag{8.42} $$

where $\hat{\mu}_k$ is the magnetic moment operator, $\gamma$ is the gyromagnetic ratio and
$\hat{s}_k$ is the spin operator. The gyromagnetic ratio is usually written in the form

$$ \gamma = g\frac{q}{2m}, \tag{8.43} $$

where m is the particle mass, q is the charge and g is the reduced gyromagnetic
ratio. The reduced gyromagnetic ratios are determined experimentally to be

    electron   $g_e = 2$        (actually $2(1 + \alpha/2\pi + \cdots)$),
    proton     $g_p = 5.587$,
    neutron    $g_n = -3.826$   (using proton charge).

The value for the neutron is negative because its spin and magnetic moment
are antiparallel. All of the above are spin-1/2 particles for which we have
$\hat{s}_k = (\hbar/2)\hat{\sigma}_k$.
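To give a feel for the magnitudes in equation (8.43), the following sketch evaluates $\gamma = gq/2m$ for the electron and the proton. The physical constants are standard SI/CODATA values supplied here rather than taken from the text, the function names are illustrative, and the conversion of $\gamma$ times a field strength into a frequency assumes the usual Larmor relation.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge in C (exact SI value)
M_ELECTRON = 9.1093837015e-31  # electron mass in kg (CODATA 2018)
M_PROTON = 1.67262192369e-27   # proton mass in kg (CODATA 2018)


def gyromagnetic_ratio(g, q, m):
    """gamma = g q / (2 m), in rad s^-1 T^-1 for SI inputs."""
    return g * q / (2.0 * m)


def precession_frequency_hz(g, q, m, b_field):
    """Ordinary (not angular) frequency |gamma| B / (2 pi) for a field in tesla."""
    return abs(gyromagnetic_ratio(g, q, m)) * b_field / (2.0 * math.pi)


gamma_e = gyromagnetic_ratio(2.0, -E_CHARGE, M_ELECTRON)
gamma_p = gyromagnetic_ratio(5.587, E_CHARGE, M_PROTON)
f_p = precession_frequency_hz(5.587, E_CHARGE, M_PROTON, 1.0)

print(f"electron gamma = {gamma_e:.4e} rad/s/T")
print(f"proton   gamma = {gamma_p:.4e} rad/s/T")
print(f"proton frequency in a 1 T field = {f_p / 1e6:.2f} MHz")
```

The proton figure of a few tens of megahertz per tesla is why magnetic resonance experiments on nuclei operate at radio frequencies.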

Now suppose that the particle is placed in a magnetic field, and that all of
the spatial dynamics has been separated out. We introduce the Hamiltonian
operator

$$ \hat{H} = -\tfrac{1}{2}\gamma\hbar B_k\hat{\sigma}_k = -\hat{\mu}_kB_k. \tag{8.44} $$

The spin state at time t is then written as

$$ |\psi(t)\rangle = \alpha(t)|\!\uparrow\rangle + \beta(t)|\!\downarrow\rangle, \tag{8.45} $$

with $\alpha$ and $\beta$ general complex coefficients. The dynamical equation for these
coefficients is given by the time-dependent Schrödinger equation

$$ \hat{H}|\psi\rangle = i\hbar\frac{d|\psi\rangle}{dt}. \tag{8.46} $$

This equation can be hard to analyse, conventionally, because it involves a pair
of coupled differential equations for $\alpha$ and $\beta$. Instead, let us see what the
Schrödinger equation looks like in the geometric algebra formulation. We first
write the equation in the form

$$ \frac{d|\psi\rangle}{dt} = \tfrac{1}{2}\gamma iB_k\hat{\sigma}_k|\psi\rangle. \tag{8.47} $$

Now replacing $|\psi\rangle$ by the multivector $\psi$ we see that the left-hand side is simply
$\dot{\psi}$, where the dot denotes the time derivative. The right-hand side involves
multiplication of the spinor $|\psi\rangle$ by $i\hat{\sigma}_k$, which we replace by

$$ i\hat{\sigma}_k|\psi\rangle \leftrightarrow \sigma_k\psi\sigma_3(I\sigma_3) = I\sigma_k\psi. \tag{8.48} $$

The Schrödinger equation (8.46) is therefore simply

$$ \dot{\psi} = \tfrac{1}{2}\gamma IB\psi, \tag{8.49} $$

where $B = B_k\sigma_k$. If we now decompose $\psi$ into $\rho^{1/2}R$ we see that

$$ \dot{\psi}\psi^\dagger = \tfrac{1}{2}\dot{\rho} + \rho\dot{R}R^\dagger = \tfrac{1}{2}\rho\gamma IB. \tag{8.50} $$

The right-hand side is a bivector, so $\rho$ must be constant. This is to be expected,
as the evolution should be unitary. The dynamics now reduces to

$$ \dot{R} = \tfrac{1}{2}\gamma IBR, \tag{8.51} $$

so the quantum theory of a spin-1/2 particle in a magnetic field reduces to a
simple rotor equation. This is very natural, if one thinks about the behaviour of
particles in magnetic fields, and is an important justification for our approach.

Recovering a rotor equation explains the difficulty of the traditional analysis
based on a pair of coupled equations for the components of $|\psi\rangle$. This approach
fails to capture the fact that there is a rotor underlying the dynamics, and so
carries along a redundant degree of freedom in the normalisation. In addition, the
separation of a rotor into a pair of components is far from natural. For example,
suppose that B is a constant field. The rotor equation integrates immediately
to give

$$ \psi(t) = e^{\gamma IBt/2}\psi_0. \tag{8.52} $$

The spin vector s therefore just precesses in the $IB$ plane at a rate $\omega_0 = \gamma|B|$.
Even this simple result is rather more difficult to establish when working with
the components of $|\psi\rangle$.
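The precession described by equation (8.52) can be illustrated with a short numerical sketch. It assumes the matrix correspondence $I\sigma_k\psi \leftrightarrow i\hat{\sigma}_k|\psi\rangle$ of equation (8.48), so that the rotor becomes the matrix $\cos(\gamma|B|t/2) + i\,\hat{B}_k\hat{\sigma}_k\sin(\gamma|B|t/2)$, with $\hat{B}$ here the unit vector along the field. The function names and the numerical values of $\gamma$ and B are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def evolve(psi0, b_field, gamma, t):
    """Matrix form of psi(t) = exp(gamma I B t / 2) psi_0 for a constant field."""
    b = np.asarray(b_field, dtype=float)
    b_norm = np.linalg.norm(b)
    n_sigma = sum(n_k * s for n_k, s in zip(b / b_norm, SIGMA))
    phase = 0.5 * gamma * b_norm * t
    # (n.sigma)^2 = 1, so the exponential closes into cos + i sin form.
    u = np.cos(phase) * I2 + 1j * np.sin(phase) * n_sigma
    return u @ np.asarray(psi0, dtype=complex)


def spin_vector(psi):
    """Spin vector components in units where hbar = 1."""
    return np.array([0.5 * np.vdot(psi, s @ psi).real for s in SIGMA])


gamma = 2.0                                # illustrative value
b_field = np.array([0.0, 0.0, 1.5])        # constant field along z
psi0 = np.array([1.0, 1.0]) / np.sqrt(2)   # spin initially along +x
period = 2 * np.pi / (gamma * np.linalg.norm(b_field))

s0 = spin_vector(psi0)
s_half = spin_vector(evolve(psi0, b_field, gamma, period / 2))
s_full = spin_vector(evolve(psi0, b_field, gamma, period))
print("s(0)   =", s0)
print("s(T/2) =", s_half)   # reversed in the plane perpendicular to B
print("s(T)   =", s_full)   # back to the starting direction
```

After one period $T = 2\pi/\omega_0$ the spin vector has returned to its initial direction, while the spinor itself has picked up a factor of $-1$, consistent with section 8.1.4.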
8.1.6 NMR and magnetic resonance imaging

A more interesting example of a particle in a magnetic field is provided by nuclear
magnetic resonance, or NMR. Suppose that the B field includes an oscillatory
field $(B_1\cos(\omega t), B_1\sin(\omega t), 0)$ together with a constant field along the z axis.
This oscillatory field induces transitions (spin-flips) between the up and down
states, which differ in energy because of the constant component of the field.
This is a very interesting system of great practical importance. It is the basis of
magnetic resonance imaging and Rabi molecular beam spectroscopy.

To study this system we first write the B field as

$$ B = B_1(\cos(\omega t)\sigma_1 + \sin(\omega t)\sigma_2) + B_0\sigma_3 = S(B_1\sigma_1 + B_0\sigma_3)S^\dagger, \tag{8.53} $$

where

$$ S = e^{-\omega tI\sigma_3/2}. \tag{8.54} $$

We now define

$$ B_c = B_1\sigma_1 + B_0\sigma_3 \tag{8.55} $$

so that we can write $B = SB_cS^\dagger$. The rotor equation now simplifies to

$$ S^\dagger\dot{\psi} = \tfrac{1}{2}\gamma IB_cS^\dagger\psi, \tag{8.56} $$

where we have pre-multiplied by $S^\dagger$, and we continue to use $\psi$ for the normalised
rotor. Now noting that

$$ \dot{S}^\dagger = \tfrac{1}{2}\omega I\sigma_3S^\dagger \tag{8.57} $$

we see that

$$ \frac{d}{dt}(S^\dagger\psi) = \tfrac{1}{2}(\gamma IB_c + \omega I\sigma_3)S^\dagger\psi. \tag{8.58} $$

It is now $S^\dagger\psi$ that satisfies a rotor equation with a constant field. The solution
is straightforward:

$$ S^\dagger\psi(t) = \exp\bigl(\tfrac{1}{2}\gamma tIB_c + \tfrac{1}{2}\omega tI\sigma_3\bigr)\psi_0, \tag{8.59} $$

and we arrive at

$$ \psi(t) = \exp\bigl(-\tfrac{1}{2}\omega tI\sigma_3\bigr)\exp\bigl(\tfrac{1}{2}(\omega_0 + \omega)tI\sigma_3 + \tfrac{1}{2}\omega_1tI\sigma_1\bigr)\psi_0, \tag{8.60} $$

where $\omega_1 = \gamma B_1$. There are three separate frequencies in this solution, which
contains a wealth of interesting physics.

To complete our analysis we must relate our solution to the results of
experiments. Suppose that at time t = 0 we switch on the oscillating field. The
particle is initially in a spin-up state, so $\psi_0 = 1$, which also ensures that the state
is normalised. The probability that at time t the particle is in the spin-down
state is

$$ P_\downarrow = |\langle\downarrow|\psi(t)\rangle|^2. \tag{8.61} $$




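We therefore need to form the inner product

$$ \langle\downarrow|\psi(t)\rangle \leftrightarrow \langle I\sigma_2\psi\rangle_q = \langle I\sigma_2\psi\rangle - I\sigma_3\langle I\sigma_1\psi\rangle. \tag{8.62} $$

To find this inner product we write

$$ \psi(t) = e^{-\omega tI\sigma_3/2}\bigl(\cos(\alpha t/2) + I\hat{B}\sin(\alpha t/2)\bigr), \tag{8.63} $$

where

$$ \hat{B} = \frac{(\omega_0 + \omega)\sigma_3 + \omega_1\sigma_1}{\alpha} \quad\text{and}\quad \alpha = \sqrt{(\omega + \omega_0)^2 + \omega_1^2}. \tag{8.64} $$

The only term giving a contribution in the $I\sigma_1$ and $I\sigma_2$ planes is that in
$\omega_1I\sigma_1/\alpha$. We therefore have

$$ \langle I\sigma_2\psi\rangle_q = \frac{\omega_1\sin(\alpha t/2)}{\alpha}\,e^{-\omega tI\sigma_3/2}I\sigma_3 \tag{8.65} $$

and the probability is immediately

$$ P_\downarrow = \left(\frac{\omega_1\sin(\alpha t/2)}{\alpha}\right)^2. \tag{8.66} $$

The maximum value is at $\alpha t = \pi$, and the probability at this time is maximised
by choosing $\alpha$ as small as possible. This is achieved by setting $\omega = -\omega_0 = -\gamma B_0$.
This is the spin resonance condition which is the basis of NMR spectroscopy.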
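The transition probability (8.66) can be compared against a direct numerical integration of the matrix equation (8.47) with the rotating field of equation (8.53). The sketch below is one way to do this, not the method used in the text: the fourth-order Runge-Kutta integrator, the step count and the sample frequencies are all assumptions made for illustration, with $\omega_0 = \gamma B_0$ and $\omega_1 = \gamma B_1$ passed in directly.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)


def rabi_probability(t, w, w0, w1):
    """Closed form (8.66) with alpha = sqrt((w + w0)^2 + w1^2)."""
    alpha = np.hypot(w + w0, w1)
    return (w1 * np.sin(alpha * t / 2) / alpha) ** 2


def rhs(t, psi, w, w0, w1):
    """d psi/dt = (i/2) gamma B_k(t) sigma_k psi, with gamma B0 = w0, gamma B1 = w1."""
    gamma_b = w1 * (np.cos(w * t) * S1 + np.sin(w * t) * S2) + w0 * S3
    return 0.5j * gamma_b @ psi


def spin_down_probability(t_end, w, w0, w1, steps=2000):
    """Integrate from a spin-up state with classical RK4 and return |<down|psi>|^2."""
    psi = np.array([1.0, 0.0], dtype=complex)
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, psi, w, w0, w1)
        k2 = rhs(t + h / 2, psi + h / 2 * k1, w, w0, w1)
        k3 = rhs(t + h / 2, psi + h / 2 * k2, w, w0, w1)
        k4 = rhs(t + h, psi + h * k3, w, w0, w1)
        psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return abs(psi[1]) ** 2


w, w0, w1, t_end = 0.7, 1.3, 0.5, 4.0      # illustrative off-resonance values
print(spin_down_probability(t_end, w, w0, w1), rabi_probability(t_end, w, w0, w1))

# On resonance (w = -w0) the flip is complete at alpha t = pi, i.e. t = pi / w1.
print(spin_down_probability(np.pi / w1, -w0, w0, w1))
```

Off resonance the maximum probability is $\omega_1^2/\alpha^2 < 1$, so sweeping $\omega$ through $-\omega_0$ produces the sharp response exploited in NMR.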


8.2 Relativistic quantum states 


The relativistic quantum dynamics of a spin-1/2 particle is described by the
Dirac theory. The Dirac matrix operators are

$$ \hat{\gamma}_0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \quad \hat{\gamma}_k = \begin{pmatrix} 0 & -\hat{\sigma}_k \\ \hat{\sigma}_k & 0 \end{pmatrix}, \quad \hat{\gamma}_5 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}, \tag{8.67} $$

where $\hat{\gamma}_5 = -i\hat{\gamma}_0\hat{\gamma}_1\hat{\gamma}_2\hat{\gamma}_3$ and $I$ is the 2 × 2 identity matrix. These matrices act
on Dirac spinors, which have four complex components (eight real degrees of
freedom). We follow an analogous procedure to the Pauli case and map these
spinors onto elements of the eight-dimensional even subalgebra of the spacetime
algebra. Dirac spinors can be visualised as decomposing into ‘upper’ and ‘lower’
components,

$$ |\psi\rangle = \begin{pmatrix} |\phi\rangle \\ |\eta\rangle \end{pmatrix}, \tag{8.68} $$

where $|\phi\rangle$ and $|\eta\rangle$ are a pair of two-component spinors. We already know how
to represent these as multivectors $\phi$ and $\eta$, which lie in the space of scalars +
relative bivectors. Our map from the Dirac spinor onto an element of the full
eight-dimensional subalgebra is simply

$$ |\psi\rangle = \begin{pmatrix} |\phi\rangle \\ |\eta\rangle \end{pmatrix} \leftrightarrow \psi = \phi + \eta\sigma_3. \tag{8.69} $$

The action of the Dirac matrix operators now becomes

$$ \begin{aligned} \hat{\gamma}_\mu|\psi\rangle &\leftrightarrow \gamma_\mu\psi\gamma_0 \quad (\mu = 0,\ldots,3), \\ i|\psi\rangle &\leftrightarrow \psi I\sigma_3, \\ \hat{\gamma}_5|\psi\rangle &\leftrightarrow \psi\sigma_3. \end{aligned} \tag{8.70} $$

Again, verifying the details of this map is a matter of routine computation. One
feature is that we now have two ‘reference’ vectors that can appear on the
right-hand side of $\psi$: $\gamma_0$ and $\gamma_3$. That is, the relative vector $\sigma_3$ used in the Pauli
theory has been decomposed into a spacelike and a timelike direction. As in the
Pauli theory, these reference vectors multiplying $\psi$ from the right do not break
Lorentz covariance, as all observables are formed by rotating these reference
vectors onto the frame of observables. Since $I\sigma_3$ and $\gamma_0$ commute, our use of
right-multiplication by $I\sigma_3$ for the complex structure remains consistent.
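As a consistency check on the matrix operators, the sketch below builds the four $\hat{\gamma}_\mu$ in the Dirac-Pauli block form (lower-index matrices, with $\hat{\gamma}_k$ having $-\hat{\sigma}_k$ in the upper-right block; this block layout is an assumption about the representation in use) and verifies the Clifford relations $\hat{\gamma}_\mu\hat{\gamma}_\nu + \hat{\gamma}_\nu\hat{\gamma}_\mu = 2\eta_{\mu\nu}$ together with the stated formula for $\hat{\gamma}_5$. The variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Assumed Dirac-Pauli block forms of the lower-index gamma matrices.
G0 = np.block([[I2, Z2], [Z2, -I2]])
GK = [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
GAMMA = [G0] + GK

# gamma_5 = -i gamma_0 gamma_1 gamma_2 gamma_3
G5 = -1j * G0 @ GK[0] @ GK[1] @ GK[2]

ETA = np.diag([1.0, -1.0, -1.0, -1.0])   # metric signature (+, -, -, -)
I4 = np.eye(4, dtype=complex)

for mu in range(4):
    for nu in range(4):
        anticommutator = GAMMA[mu] @ GAMMA[nu] + GAMMA[nu] @ GAMMA[mu]
        assert np.allclose(anticommutator, 2 * ETA[mu, nu] * I4)

print(np.real_if_close(G5))   # off-diagonal identity blocks
```

The same relations, $\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\eta_{\mu\nu}$, are satisfied by the spacetime algebra vectors themselves, which is what allows the matrices to be dispensed with.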

The goal of our approach is to perform all calculations without ever having to
introduce an explicit matrix representation. The explicit map of equation (8.69)
is for column spinors written in the Dirac-Pauli representation, but it is a simple
matter to establish similar maps for other representations. All one needs to do
is find the unitary matrix which transforms the second representation into the
Dirac-Pauli one, and then apply the map of equation (8.69). All of the matrix
operators are then guaranteed to have the equivalence defined in equation (8.70).
Certain other operations, such as complex conjugation, depend on the particular
representation. But rather than think of these as the same operation in different
representations, it is simpler to view them as different operations which can be
applied to the multivector $\psi$.

In order to discuss the observables of the Dirac theory, we must first distinguish
between the Hermitian and Dirac adjoints. The Hermitian adjoint is written as
usual as $\langle\psi|$. The Dirac adjoint is written as $\langle\bar{\psi}|$ and is defined by

$$ \langle\bar{\psi}| = \bigl(\langle\psi_u|, -\langle\psi_l|\bigr), \tag{8.71} $$

where the subscripts u and l refer to the upper and lower components. It is
the Dirac adjoint which gives Lorentz-covariant observables. The Dirac inner
product decomposes into

$$ \langle\bar{\psi}|\phi\rangle = \langle\psi_u|\phi_u\rangle - \langle\psi_l|\phi_l\rangle. \tag{8.72} $$

This has the equivalent form

$$ \langle\psi_u^\dagger\phi_u\rangle_q - \langle\psi_l^\dagger\phi_l\rangle_q = \langle(\psi_u^\dagger - \sigma_3\psi_l^\dagger)(\phi_u + \phi_l\sigma_3)\rangle_q = \langle\tilde{\psi}\phi\rangle_q. \tag{8.73} $$

So the Dirac adjoint is replaced by the manifestly covariant operation of
spacetime reversion in the spacetime algebra formulation. The Hermitian adjoint now
becomes

$$ \langle\psi| \leftrightarrow \psi^\dagger = \gamma_0\tilde{\psi}\gamma_0, \tag{8.74} $$

which defines the meaning of the dagger symbol in the full spacetime algebra.
Clearly, this operation requires singling out a preferred timelike vector, so is not
covariant. In the relative space defined by $\gamma_0$, the Hermitian adjoint reduces to
the non-relativistic reverse operation, so our notation is consistent with the use
of the dagger for the reverse in three-dimensional space.

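We can now look at the main observables formed from a Dirac spinor. The
first is the current

$$ J_\mu = \langle\bar{\psi}|\hat{\gamma}_\mu|\psi\rangle \leftrightarrow \langle\tilde{\psi}\gamma_\mu\psi\gamma_0\rangle - \langle\tilde{\psi}\gamma_\mu\psi I\gamma_3\rangle I\sigma_3. \tag{8.75} $$

The final term contains $\langle\gamma_\mu\psi I\gamma_3\tilde{\psi}\rangle$. This vanishes because $\psi I\gamma_3\tilde{\psi}$ is odd-grade
and reverses to minus itself, so is a pure trivector. Similarly, $\psi\gamma_0\tilde{\psi}$ is a pure
vector, and we are left with

$$ J_\mu = \langle\bar{\psi}|\hat{\gamma}_\mu|\psi\rangle \leftrightarrow \gamma_\mu\cdot(\psi\gamma_0\tilde{\psi}). \tag{8.76} $$

As with the Pauli theory, the operation of taking the expectation value of a
matrix operator is replaced by that of picking out a component of a vector. We
can therefore reconstitute the full vector J and write

$$ J = \psi\gamma_0\tilde{\psi} \tag{8.77} $$

for the first of our observables.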
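To gain some further insight into the form of J, and its formation from $\psi$, we
introduce the scalar + pseudoscalar quantity $\psi\tilde{\psi}$ as

$$ \psi\tilde{\psi} = \rho e^{I\beta}. \tag{8.78} $$

Factoring this out from $\psi$, we define the spacetime rotor R:

$$ R = \psi\rho^{-1/2}e^{-I\beta/2}, \qquad R\tilde{R} = 1. \tag{8.79} $$

(If $\rho = 0$ a slightly different procedure can be used.) We have now decomposed
the spinor $\psi$ into

$$ \psi = \rho^{1/2}e^{I\beta/2}R, \tag{8.80} $$

which separates out a density $\rho$ and the rotor R. The remaining factor of $\beta$
is curious. It turns out that plane-wave particle states have $\beta = 0$, whereas
antiparticle states have $\beta = \pi$. The picture for bound state wavefunctions is
more complicated, however, and $\beta$ appears to act as a remnant of multiparticle
effects from the full quantum field theory.

    Bilinear        Standard                                                  STA                                                        Frame-free
    covariant       form                                                      equivalent                                                 form
    Scalar          $\langle\bar{\psi}|\psi\rangle$                           $\langle\psi\tilde{\psi}\rangle$                           $\rho\cos(\beta)$
    Vector          $\langle\bar{\psi}|\hat{\gamma}_\mu|\psi\rangle$          $\gamma_\mu\cdot(\psi\gamma_0\tilde{\psi})$                $\psi\gamma_0\tilde{\psi} = J$
    Bivector        $\langle\bar{\psi}|i\hat{\gamma}_{\mu\nu}|\psi\rangle$    $(\gamma_\mu\wedge\gamma_\nu)\cdot(\psi I\sigma_3\tilde{\psi})$    $\psi I\sigma_3\tilde{\psi} = S$
    Pseudovector    $\langle\bar{\psi}|\hat{\gamma}_\mu\hat{\gamma}_5|\psi\rangle$    $\gamma_\mu\cdot(\psi\gamma_3\tilde{\psi})$        $\psi\gamma_3\tilde{\psi} = s$
    Pseudoscalar    $\langle\bar{\psi}|i\hat{\gamma}_5|\psi\rangle$           $\langle\psi\tilde{\psi}I\rangle$                          $-\rho\sin(\beta)$

Table 8.1 Observables in the Dirac theory. The standard expressions for
the bilinear covariants are shown, together with their spacetime algebra
(STA) equivalents.

With this decomposition of $\psi$, the current becomes

$$ J = \psi\gamma_0\tilde{\psi} = \rho e^{I\beta/2}R\gamma_0\tilde{R}e^{I\beta/2} = \rho R\gamma_0\tilde{R}. \tag{8.81} $$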


So the rotor is now an instruction to rotate $\gamma_0$ onto the direction of the current.
This is precisely the picture we adopted in section 5.5 for studying the dynamics
of a relativistic point particle.

A similar picture emerges for the spin. In relativistic mechanics angular
momentum is a bivector quantity. Accordingly, the spin observables form a rank-2
antisymmetric tensor, with components given by

$$ \langle\bar{\psi}|i\tfrac{1}{2}(\hat{\gamma}_\mu\hat{\gamma}_\nu - \hat{\gamma}_\nu\hat{\gamma}_\mu)|\psi\rangle \leftrightarrow \langle\gamma_\mu\gamma_\nu\psi I\sigma_3\tilde{\psi}\rangle = (\gamma_\mu\wedge\gamma_\nu)\cdot(\psi I\sigma_3\tilde{\psi}), \tag{8.82} $$

where again there is no imaginary component. This time we are picking out the
components of the spin bivector S, given by

$$ S = \psi I\sigma_3\tilde{\psi}. \tag{8.83} $$

This is the natural spacetime generalisation of the Pauli result of equation (8.18).
(Factors of $\hbar/2$ can always be inserted when required.) There are five such
observables in all, which are summarised in Table 8.1. Of particular interest is
the spin vector $s = \rho R\gamma_3\tilde{R}$. This justifies the classical model of spin introduced
in section 5.5.6, where it was shown that the rotor form of the Lorentz force law
naturally gives rise to a reduced gyromagnetic ratio of g = 2.


8.3 The Dirac equation 


While much of the preceding discussion is both suggestive about the role of
spinors in quantum theory, and algebraically very useful, one has to remember
that quantum mechanics deals with wave equations. We therefore need to
construct a relativistic wave equation for our Dirac spinor $\psi$, where $\psi$ is an
element of the eight-dimensional even subalgebra of the spacetime algebra. The
relativistic wave equation for a spin-1/2 particle is the Dirac equation. This is
a first-order wave equation, which is both Lorentz-invariant and has a
future-pointing conserved current.

Like Pauli spinors, $\psi$ is also subject to a single-sided rotor transformation law,
$\psi \mapsto R\psi$, where R is a Lorentz rotor. To write down a covariant equation, we
can therefore only place other covariant objects on the left of $\psi$. The available
objects are any scalar or pseudoscalar, the vector derivative $\nabla$ and any gauge
fields describing interactions. On the right of $\psi$ we can place combinations of
$\gamma_0$, $\gamma_3$ and $I\sigma_3$. The first equation we could write down is simply

$$ \nabla\psi = 0. \tag{8.84} $$

This is the spacetime generalisation of the Cauchy-Riemann equations, as
described in section 6.3. Remarkably, this equation does describe the behaviour of
fermions: it is the wave equation for a (massless) neutrino. Any solution to
this decomposes into two separate solutions by writing

$$ \psi = \psi\tfrac{1}{2}(1 + \sigma_3) + \psi\tfrac{1}{2}(1 - \sigma_3) = \psi_+ + \psi_-. \tag{8.85} $$

The separate solutions $\psi_+$ and $\psi_-$ are the right-handed and left-handed helicity
eigenstates. For neutrinos, nature only appears to make use of the left-handed
solutions. A more complete treatment of this subject involves the electroweak
theory. (In fact, recent experiments point towards neutrinos carrying a small
mass, whose origin can be explained by an interaction with the Higgs field.)

The formal operator identification of $i\partial_\mu$ with $p_\mu$ tells us that any wavefunction
for a free massive particle should satisfy the Klein-Gordon equation
$\nabla^2\psi = -m^2\psi$. We therefore need to add to the right-hand side of equation (8.84) a term
that is linear in the particle mass m and that generates $-m^2\psi$ on squaring the
operator. The natural covariant vector to form on the left of $\psi$ is the momentum
$\gamma^\mu p_\mu$. In terms of this operator we are led to an equation of the form

$$ p\psi = m\psi a_0, \tag{8.86} $$

where $a_0$ is some multivector to be determined. It is immediately clear that $a_0$
must have odd grade, and must square to +1. The obvious candidate is $\gamma_0$, so
that $\psi$ contains a rotor to transform $\gamma_0$ to the velocity p/m. We therefore arrive
at the equation

$$ \nabla\psi I\sigma_3 = m\psi\gamma_0. \tag{8.87} $$

This is the Dirac equation in its spacetime algebra form. This is easily seen to
be equivalent to the matrix form of the equation

$$ \hat{\gamma}^\mu(i\partial_\mu - eA_\mu)|\psi\rangle = m|\psi\rangle, \tag{8.88} $$




where the electromagnetic vector potential has been included. The full Dirac
equation is now

$$ \nabla\psi I\sigma_3 - eA\psi = m\psi\gamma_0. \tag{8.89} $$

A remarkable feature of this formulation is that the equation and all of its
observables have been captured in the real algebra of spacetime, with no need for
a unit imaginary. This suggests that interpretations of quantum mechanics that
place great significance in the need for complex numbers are wide of the mark.


8.3.1 Symmetries and currents 


The subject of the symmetries of the Dirac equation, and their conjugate
currents, is discussed more fully in chapter 12. Here we highlight the main results.
There are three important discrete symmetry operations: charge conjugation,
parity and time reversal, denoted C, P and T respectively. Following the
conventions of Bjorken & Drell (1964) we find that

$$ \begin{aligned} \hat{P}|\psi\rangle &\leftrightarrow \gamma_0\psi(\bar{x})\gamma_0, \\ \hat{C}|\psi\rangle &\leftrightarrow \psi\sigma_1, \\ \hat{T}|\psi\rangle &\leftrightarrow I\gamma_0\psi(-\bar{x})\gamma_1, \end{aligned} \tag{8.90} $$

where $\bar{x} = \gamma_0x\gamma_0$ is (minus) the reflection of x in the timelike $\gamma_0$ axis. The
combined CPT symmetry corresponds to

$$ \psi \mapsto -I\psi(-x) \tag{8.91} $$

so that CPT symmetry does not require singling out a preferred timelike vector.

Amongst the continuous symmetries of the Dirac equation, the most significant
is local electromagnetic gauge invariance. The equation is unchanged in physical
content if we make the simultaneous replacements

$$ \psi \mapsto \psi e^{\alpha I\sigma_3}, \qquad eA \mapsto eA - \nabla\alpha. \tag{8.92} $$

The conserved current conjugate to this symmetry is the Dirac current $J = \psi\gamma_0\tilde{\psi}$.
This satisfies

$$ \begin{aligned} \nabla\cdot J &= \langle\nabla\psi\gamma_0\tilde{\psi}\rangle + \langle\psi\gamma_0\dot{\tilde{\psi}}\dot{\nabla}\rangle \\ &= -2\langle(eA\psi\gamma_0 + m\psi)I\sigma_3\tilde{\psi}\rangle \\ &= 0 \end{aligned} \tag{8.93} $$

and so is conserved even in the presence of a background field. This is important.
It means that single fermions cannot be created or destroyed. This feature was
initially viewed as a great strength of the Dirac equation, though ultimately it
is its biggest weakness. Fermion pairs, such as an electron and a positron, can
be created and destroyed, a process which cannot be explained by the Dirac
equation alone. These are many-body problems and are described by quantum
field theory.

The timelike component of J in the $\gamma_0$ frame, say, is

$$ J_0 = \gamma_0\cdot J = \langle\gamma_0\psi\gamma_0\tilde{\psi}\rangle = \langle\psi^\dagger\psi\rangle > 0, \tag{8.94} $$

which is positive definite. This is interpreted as a probability density, and
localised wave functions are usually normalised such that

$$ \int d^3x\,J_0 = 1. \tag{8.95} $$

Arriving at a relativistic theory with a consistent probabilistic interpretation was
Dirac’s original goal.


8.3.2 Plane-wave states 
A positive energy plane-wave state is defined by

$$ \psi = \psi_0e^{-I\sigma_3p\cdot x}, \tag{8.96} $$

where $\psi_0$ is a constant spinor. The Dirac equation (8.87) tells us that $\psi_0$ satisfies

$$ p\psi_0 = m\psi_0\gamma_0, \tag{8.97} $$

and post-multiplying by $\tilde{\psi}_0$ we see that

$$ p\psi_0\tilde{\psi}_0 = mJ. \tag{8.98} $$

Recalling that we have $\psi\tilde{\psi} = \rho e^{I\beta}$, and noting that both p and J are vectors,
we see that we must have $\exp(I\beta) = \pm1$. For positive energy states the timelike
component of p is positive, as is the timelike component of J, so we take
the positive solution $\beta = 0$. It follows that $\psi_0$ is then simply a rotor with a
normalisation constant. The proper boost L taking $m\gamma_0$ onto the momentum
has

$$ p = mL\gamma_0\tilde{L} = mL^2\gamma_0, \tag{8.99} $$

and from section 5.4.4 the solution is

$$ L = \frac{m + p\gamma_0}{[2m(m + p\cdot\gamma_0)]^{1/2}} = \frac{E + m + \boldsymbol{p}}{[2m(E + m)]^{1/2}}, \tag{8.100} $$

where $p\gamma_0 = E + \boldsymbol{p}$. The full spinor $\psi_0$ is LU, where U is a spatial rotor in the
$\gamma_0$ frame, so is a Pauli spinor.

Negative-energy solutions have a phase factor of $\exp(+I\sigma_3p\cdot x)$, with
$E = \gamma_0\cdot p > 0$. For these we have $-p\psi\tilde{\psi} = mJ$ so it is clear that we now need $\beta = \pi$.
Positive and negative energy plane wave states can therefore be summarised by

$$ \begin{aligned} \text{positive energy:}\quad \psi^{(+)}(x) &= L(p)U_re^{-I\sigma_3p\cdot x}, \\ \text{negative energy:}\quad \psi^{(-)}(x) &= L(p)U_rIe^{I\sigma_3p\cdot x}, \end{aligned} \tag{8.101} $$

with L(p) given by equation (8.100). The subscript r on the spatial rotors labels
the spin state, with $U_0 = 1$, $U_1 = -I\sigma_2$. These plane wave solutions are the
fundamental components of scattering theory.
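The boost of equation (8.100) can be checked in a matrix representation. The sketch below assumes the Dirac-Pauli block form of the lower-index gamma matrices as stand-ins for the $\gamma_\mu$, writes $p = E\gamma_0 + p^k\gamma_k$, and represents the reverse $\tilde{L}$ by reversing the order of the vector factors, so that $\tilde{L} \propto m + \gamma_0p$. The momentum values and variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Assumed matrix stand-ins for the spacetime vectors gamma_0 and gamma_k.
G0 = np.block([[I2, Z2], [Z2, -I2]])
GK = [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
I4 = np.eye(4, dtype=complex)

m = 1.0
p3 = np.array([0.3, -0.4, 1.2])              # illustrative spatial momentum
E = np.sqrt(m**2 + p3 @ p3)                  # on-shell energy
P = E * G0 + sum(pk * g for pk, g in zip(p3, GK))   # p = p^mu gamma_mu

norm = np.sqrt(2 * m * (E + m))
L = (m * I4 + P @ G0) / norm                 # equation (8.100)
L_rev = (m * I4 + G0 @ P) / norm             # reverse: order of vectors swapped

print(np.allclose(L @ L, P @ G0 / m))        # p = m L^2 gamma_0
print(np.allclose(L @ G0 @ L_rev, P / m))    # p = m L gamma_0 L~
print(np.allclose(L @ L_rev, I4))            # L is a rotor
```

All three checks rely only on $p^2 = m^2$ and $p\gamma_0 + \gamma_0p = 2E$, so they hold for any on-shell momentum.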


8.3.3 Hamiltonian form and the Pauli equation 


The problem of how to best formulate operator techniques within spacetime
algebra is little more than a question of finding a good notation. We could of
course borrow the traditional Dirac ‘bra-ket’ notation, but we have already seen
that the bilinear covariants are better handled without it. It is easier instead to
just juxtapose the operator and the wavefunction on which it acts. But we saw
in section 8.2 that the operators often act double-sidedly on the spinor $\psi$. This
is not a problem, as the only permitted right-sided operations are multiplication
by $\gamma_0$ or $I\sigma_3$, and these operations commute. Our notation can therefore safely
suppress these right-sided multiplications and gather all operations on the left.
The overhat notation is useful to achieve this and we define

$$ \hat{\gamma}_\mu\psi = \gamma_\mu\psi\gamma_0. \tag{8.102} $$

It should be borne in mind that all operations are now defined in the spacetime
algebra, so the $\hat{\gamma}_\mu$ are not to be read as matrix operators, as they were in
section 8.2. Of course, the action of the operators in either system is identical.

It is also useful to have a symbol for the operation of right-sided multiplication
by $I\sigma_3$. The symbol j carries the correct connotations of an operator that
commutes with all others and squares to −1, and we define

$$ j\psi = \psi I\sigma_3. \tag{8.103} $$

The Dirac equation can now be written in the ‘operator’ form

$$ j\hat{\nabla}\psi - e\hat{A}\psi = m\psi, \tag{8.104} $$

where

$$ \hat{\nabla}\psi = \nabla\psi\gamma_0 \quad\text{and}\quad \hat{A}\psi = A\psi\gamma_0. \tag{8.105} $$

Writing the Dirac equation in the form (8.104) does not add anything new, but
does confirm that we have an efficient notation for handling operators. One
might ask why we have preferred the j symbol over the more obvious i. One
reason is historical. In much of the spacetime algebra literature it has been
common practice to denote the spacetime pseudoscalar with a small i. We now
feel that this is a misleading notation, but it is commonplace. In addition, there
are occasions when we may wish to formally complexify the spacetime algebra,
as was the case for electromagnetic scattering, covered in section 7.5. To avoid
confusion with either of these cases we have chosen to denote right-multiplication
of $\psi$ by $I\sigma_3$ as $j\psi$ in both this and the following chapter.

To express the Dirac equation in Hamiltonian form we simply multiply from
the left by $\gamma_0$. The resulting equation, with the dimensional constants
temporarily put back in, is

$$ j\hbar\partial_t\psi = c\hat{\boldsymbol{p}}\psi + eV\psi - ce\boldsymbol{A}\psi + mc^2\bar{\psi}, \tag{8.106} $$

where

$$ \begin{aligned} \hat{\boldsymbol{p}}\psi &= -j\hbar\boldsymbol{\nabla}\psi, \\ \bar{\psi} &= \gamma_0\psi\gamma_0, \\ \gamma_0A &= V - c\boldsymbol{A}. \end{aligned} \tag{8.107} $$

Choosing a Hamiltonian is a non-covariant operation, since it picks out a
preferred timelike direction. The Hamiltonian relative to the $\gamma_0$ direction is the
operator on the right-hand side of equation (8.106).

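As an application of the Hamiltonian formulation, consider the non-relativistic
reduction of the Dirac equation. This can be achieved formally via the
Foldy-Wouthuysen transformation. For details we refer the reader to Itzykson & Zuber
(1980). While the theoretical motivation for this transformation is clear, it can be
hard to compute in all but the simplest cases. A simpler approach, dating back
to Feynman, is to separate out the fast-oscillating component of the waves and
then split into separate equations for the Pauli-even and Pauli-odd components
of $\psi$. We write (with $\hbar = 1$ and the factors of c kept in)

$$ \psi = (\phi + \eta)e^{-I\sigma_3mc^2t}, \tag{8.108} $$

where $\bar{\phi} = \phi$ (Pauli-even) and $\bar{\eta} = -\eta$ (Pauli-odd). The Dirac equation (8.106)
now splits into the two equations

$$ \begin{aligned} \mathcal{E}\phi - c\mathcal{O}\eta &= 0, \\ (\mathcal{E} + 2mc^2)\eta - c\mathcal{O}\phi &= 0, \end{aligned} \tag{8.109} $$

where

$$ \begin{aligned} \mathcal{E}\phi &= (j\partial_t - eV)\phi, \\ \mathcal{O}\phi &= (\hat{\boldsymbol{p}} - e\boldsymbol{A})\phi. \end{aligned} \tag{8.110} $$

The formal solution to the second of equations (8.109) is

$$ \eta = \frac{1}{2mc}\left(1 + \frac{\mathcal{E}}{2mc^2}\right)^{-1}\mathcal{O}\phi, \tag{8.111} $$

where the inverse on the right-hand side denotes a power series. Provided the
expectation value of $\mathcal{E}$ is smaller than $2mc^2$ (which it is in the non-relativistic
limit) the series should converge. The remaining equation for $\phi$ is

$$ \mathcal{E}\phi = \frac{\mathcal{O}}{2m}\left(1 - \frac{\mathcal{E}}{2mc^2} + \cdots\right)\mathcal{O}\phi, \tag{8.112} $$

which can be expanded out to the desired order of magnitude. There is little
point in going beyond the first relativistic correction, so we approximate
equation (8.112) by

$$ \mathcal{E}\phi + \frac{\mathcal{O}\mathcal{E}\mathcal{O}}{4m^2c^2}\phi = \frac{\mathcal{O}^2}{2m}\phi. \tag{8.113} $$

We seek an equation of the form $\mathcal{E}\phi = H\phi$, where H is the non-relativistic
Hamiltonian. We therefore need to replace the $\mathcal{O}\mathcal{E}\mathcal{O}$ term in equation (8.113)
by a term that does not involve $\mathcal{E}$. To achieve this we write

$$ 2\mathcal{O}\mathcal{E}\mathcal{O} = [\mathcal{O}, [\mathcal{E}, \mathcal{O}]] + \mathcal{E}\mathcal{O}^2 + \mathcal{O}^2\mathcal{E} \tag{8.114} $$

so that equation (8.113) becomes

$$ \mathcal{E}\phi + \frac{\mathcal{E}\mathcal{O}^2 + \mathcal{O}^2\mathcal{E}}{8m^2c^2}\phi = \frac{\mathcal{O}^2}{2m}\phi - \frac{1}{8m^2c^2}[\mathcal{O}, [\mathcal{E}, \mathcal{O}]]\phi. \tag{8.115} $$

We can now make the approximation

$$ \mathcal{E}\phi \approx \frac{\mathcal{O}^2}{2m}\phi, \tag{8.116} $$

so that equation (8.113) can be approximated by

$$ \mathcal{E}\phi = \frac{\mathcal{O}^2}{2m}\phi - \frac{1}{8m^2c^2}[\mathcal{O}, [\mathcal{E}, \mathcal{O}]]\phi - \frac{1}{8m^3c^2}\mathcal{O}^4\phi, \tag{8.117} $$

which is valid to order $c^{-2}$.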
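The rearrangement in equation (8.114) is a purely algebraic identity that holds for any pair of operators with an associative product. A quick sanity check, using random complex matrices as hypothetical stand-ins for $\mathcal{O}$ and $\mathcal{E}$ (the matrix size and random seed are arbitrary), can be sketched as follows.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_operator(n):
    """Random complex n x n matrix standing in for a generic linear operator."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


def commutator(a, b):
    return a @ b - b @ a


O = random_operator(4)   # stand-in for the odd operator
E = random_operator(4)   # stand-in for the even operator

lhs = 2 * O @ E @ O
rhs = commutator(O, commutator(E, O)) + E @ O @ O + O @ O @ E
print(np.allclose(lhs, rhs))
```

Expanding the double commutator gives $2\mathcal{O}\mathcal{E}\mathcal{O} - \mathcal{O}^2\mathcal{E} - \mathcal{E}\mathcal{O}^2$, which makes the identity immediate; the value of the rewriting is that the commutator contains no time derivatives.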
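To evaluate the commutators we first need

$$ [\mathcal{E}, \mathcal{O}] = -je(\partial_t\boldsymbol{A} + \boldsymbol{\nabla}V) = je\boldsymbol{E}. \tag{8.118} $$

There are no time derivatives left in this commutator, so we do achieve a sensible
non-relativistic Hamiltonian. The full commutator required in equation (8.117)
is

$$ \begin{aligned} [\mathcal{O}, [\mathcal{E}, \mathcal{O}]] &= [-j\boldsymbol{\nabla} - e\boldsymbol{A}, je\boldsymbol{E}] \\ &= (e\boldsymbol{\nabla}\boldsymbol{E}) - 2e\boldsymbol{E}\wedge\boldsymbol{\nabla} - 2je^2\boldsymbol{A}\wedge\boldsymbol{E}. \end{aligned} \tag{8.119} $$

The various operators (8.110) and (8.119) can now be substituted into
equation (8.117) to yield the Pauli equation

$$ \begin{aligned} \frac{\partial\phi}{\partial t}I\sigma_3 = {} & \frac{1}{2m}(\hat{\boldsymbol{p}} - e\boldsymbol{A})^2\phi + eV\phi - \frac{\hat{\boldsymbol{p}}^4}{8m^3c^2}\phi \\ & - \frac{1}{8m^2c^2}\bigl(e(\boldsymbol{\nabla}\boldsymbol{E} - 2\boldsymbol{E}\wedge\boldsymbol{\nabla})\phi - 2e^2\boldsymbol{A}\wedge\boldsymbol{E}\phi I\sigma_3\bigr), \end{aligned} \tag{8.120} $$

which is written entirely in the geometric algebra of three-dimensional space.

In the standard approach, the geometric product in the $\boldsymbol{\nabla}\boldsymbol{E}$ term of
equation (8.120) is split into a ‘spin-orbit’ term $\boldsymbol{\nabla}\wedge\boldsymbol{E}$ and the ‘Darwin’ term $\boldsymbol{\nabla}\cdot\boldsymbol{E}$.
The spacetime algebra approach reveals that these terms arise from a single
source.

A similar approximation scheme can be adopted for the observables of the
Dirac theory. For example the current, $\psi\gamma_0\tilde{\psi}$, has a three-vector part:

$$ \boldsymbol{J} = (\psi\gamma_0\tilde{\psi})\wedge\gamma_0 = \phi\eta^\dagger + \eta\phi^\dagger. \tag{8.121} $$

This is approximated to leading order by

$$ \boldsymbol{J} \approx -\frac{1}{m}\bigl(\langle\boldsymbol{\nabla}\phi I\sigma_3\phi^\dagger\rangle_1 - \boldsymbol{A}\phi\phi^\dagger\bigr), \tag{8.122} $$

where the $\langle\;\rangle_1$ projects onto the grade-1 components of the Pauli algebra. Not
all applications of the Pauli theory correctly identify (8.122) as the conserved
current in the Pauli theory, an inconsistency first pointed out by Hestenes &
Gurtler (1971).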


8.4 Central potentials 


Suppose now that we restrict our discussion to problems described by a central
potential $V = V(r)$, $\boldsymbol{A} = 0$, where $r = |\boldsymbol{x}|$. The full Hamiltonian, denoted H,
reduces to

$$ j\partial_t\psi = H\psi = -j\boldsymbol{\nabla}\psi + eV(r)\psi + m\bar{\psi}. \tag{8.123} $$

Quantum states are classified in terms of eigenstates of operators that commute
with the Hamiltonian H, because the accompanying quantum numbers are
conserved in time. Of particular importance are the angular-momentum operators
$\hat{L}_i$, defined by

$$ \hat{L}_i = -i\epsilon_{ijk}x_j\partial_k. \tag{8.124} $$

These are the components of the bivector operator $i\boldsymbol{x}\wedge\boldsymbol{\nabla}$. We therefore define
the operators

$$ L_B = jB\cdot(\boldsymbol{x}\wedge\boldsymbol{\nabla}), \tag{8.125} $$

where B is a relative bivector. Throughout this section interior and exterior
products refer to the (Pauli) algebra of space. Writing $B = I\sigma_i$ recovers the
component form. The $L_B$ operators satisfy the commutation relations

$$ [L_{B_1}, L_{B_2}] = -jL_{B_1\times B_2}, \tag{8.126} $$

where $B_1\times B_2$ denotes the commutator product. The angular-momentum
commutation relations directly encode the bivector commutation relations, which are
those of the Lie algebra of the rotation group (see chapter 11). One naturally
expects this group to arise as it represents a symmetry of the potential.

If we now form the commutator of $L_B$ with the Hamiltonian H we obtain a
result that is, initially, disconcerting. The scalar operator $L_B$ commutes with
the bar operator $\psi \mapsto \bar{\psi}$, but for the momentum term we find that

$$ [B\cdot(\boldsymbol{x}\wedge\boldsymbol{\nabla}), \boldsymbol{\nabla}] = -\dot{\boldsymbol{\nabla}}B\cdot(\dot{\boldsymbol{x}}\wedge\boldsymbol{\nabla}) = B\times\boldsymbol{\nabla}. \tag{8.127} $$

The commutator does not vanish, so orbital angular momentum does not yield a
conserved quantum number in relativistic physics. But, since
$B\times\boldsymbol{\nabla} = \tfrac{1}{2}(B\boldsymbol{\nabla} - \boldsymbol{\nabla}B)$, we can write equation (8.127) as

$$ [B\cdot(\boldsymbol{x}\wedge\boldsymbol{\nabla}) - \tfrac{1}{2}B, H] = 0. \tag{8.128} $$

We therefore recover a conserved angular momentum operator by defining

$$ J_B = L_B - \tfrac{1}{2}jB. \tag{8.129} $$

In conventional notation this is

$$ \hat{J}_i = \hat{L}_i + \tfrac{1}{2}\hat{\Sigma}_i, \tag{8.130} $$

where $\hat{\Sigma}_i = (i/2)\epsilon_{ijk}\hat{\gamma}_j\hat{\gamma}_k$. The extra term of B/2 accounts for the spin-1/2
nature of Dirac particles. If we look for eigenstates of the $J_3$ operator, we see
that the spin contribution to this is

$$ -\tfrac{1}{2}jI\sigma_3\psi = \tfrac{1}{2}\sigma_3\psi\sigma_3. \tag{8.131} $$

In the non-relativistic Pauli theory the eigenstates of this operator are simply 1
and $-I\sigma_2$, with eigenvalues ±1/2. In the relativistic theory the separate spin
and orbital operators are not conserved, and it is only the combined $J_B$ operators
that commute with the Hamiltonian.

The geometric algebra derivation employed here highlights some interesting
features. Stripping away all of the extraneous terms, the result rests solely on
the commutation properties of the $B\cdot(\boldsymbol{x}\wedge\boldsymbol{\nabla})$ and $\boldsymbol{\nabla}$ operators. The factor of 1/2
would therefore be present in any dimension, and so has no special relation to the
three-dimensional rotation group. Furthermore, in writing $J_B = L_B - \tfrac{1}{2}jB$ we
are forming an explicit sum of a scalar and a bivector. The standard notation of
equation (8.130) encourages us to view these as the sum of two vector operators!


8.4.1 Spherical monogenics 


The spherical monogenics play a key role in the solution of the Dirac equation
for problems with radial symmetry. These are Pauli spinors (even elements of
the Pauli algebra) that satisfy the eigenvalue equation

$$ -\boldsymbol{x}\wedge\boldsymbol{\nabla}\psi = l\psi. \tag{8.132} $$

These functions arise naturally as solutions of the three-dimensional
generalisation of the Cauchy-Riemann equations

$$ \boldsymbol{\nabla}\Psi = 0. \tag{8.133} $$




Solutions of this equation are known in the Clifford analysis literature as
monogenics. Looking for solutions which separate into $\Psi = r^l\psi(\theta,\phi)$
yields equation (8.132), where $(r,\theta,\phi)$ is a standard set of polar
coordinates. The solutions of equation (8.132) are called spherical monogenics, or
spin-weighted spherical harmonics (with weight 1/2).

To analyse the properties of equation (8.132) we first note that

$$[J_B,\, \mathbf{x}\wedge\boldsymbol{\nabla}] = 0, \tag{8.134}$$

which is proved in the same manner as equation (8.128). It follows that $\psi$
can simultaneously be an eigenstate of the $\mathbf{x}\wedge\boldsymbol{\nabla}$ operator
and one of the $J_B$ operators. To simplify the notation we now define

$$J_k\psi = J_{I\sigma_k}\psi = \bigl((I\sigma_k)\cdot(\mathbf{x}\wedge\boldsymbol{\nabla}) - \tfrac{1}{2}I\sigma_k\bigr)\psi I\sigma_3. \tag{8.135}$$

We choose $\psi$ to be an eigenstate of $J_3$. We label this state as $\psi(l,\mu)$, so

$$-\mathbf{x}\wedge\boldsymbol{\nabla}\psi(l,\mu) = l\psi(l,\mu), \qquad J_3\psi(l,\mu) = \mu\psi(l,\mu). \tag{8.136}$$

The $J_i$ operators satisfy

$$J_iJ_i\psi(l,\mu) = \tfrac{3}{4}\psi - 2\,\mathbf{x}\wedge\boldsymbol{\nabla}\psi + \mathbf{x}\wedge\boldsymbol{\nabla}(\mathbf{x}\wedge\boldsymbol{\nabla}\psi) = (l + 1/2)(l + 3/2)\psi(l,\mu), \tag{8.137}$$

so the $\psi(l,\mu)$ are also eigenstates of $J_iJ_i$.
We next introduce the ladder operators $J_+$ and $J_-$, defined by

$$J_+ = J_1 + jJ_2, \qquad J_- = J_1 - jJ_2. \tag{8.138}$$

It is a simple matter to prove the following results:

$$\begin{aligned}
[J_+, J_-] &= 2J_3, & J_iJ_i &= J_-J_+ + J_3 + J_3^2, \\
[J_\pm, J_3] &= \mp J_\pm, & J_iJ_i &= J_+J_- - J_3 + J_3^2.
\end{aligned} \tag{8.139}$$

The raising operator $J_+$ increases the eigenvalue of $J_3$ by an integer. But, for
fixed $l$, $\mu$ must ultimately attain some maximum value. Denoting this value as
$\mu_+$, we must reach a state for which

$$J_+\psi(l,\mu_+) = 0. \tag{8.140}$$

Acting on this state with $J_iJ_i$ and using one of the results in equation (8.139)
we find that

$$(l + 1/2)(l + 3/2) = \mu_+(\mu_+ + 1). \tag{8.141}$$

Since $l$ is positive and $\mu_+$ represents an upper bound, it follows that

$$\mu_+ = l + 1/2. \tag{8.142}$$
There must similarly be a lowest eigenvalue of $J_3$ and a corresponding state
with

$$J_-\psi(l,\mu_-) = 0. \tag{8.143}$$

In this case we find that

$$(l + 1/2)(l + 3/2) = \mu_-(\mu_- - 1), \tag{8.144}$$

hence $\mu_- = -(l + 1/2)$. The spectrum of eigenvalues of $J_3$ therefore ranges from
$(l + 1/2)$ to $-(l + 1/2)$, a total of $2(l + 1)$ states. Since the $J_3$ eigenvalues are
always of the form (integer + 1/2), it is simpler to label the spherical monogenics
with a pair of integers. We therefore write the spherical monogenics as $\psi_l^m$,
where

$$-\mathbf{x}\wedge\boldsymbol{\nabla}\psi_l^m = l\psi_l^m, \qquad l \geq 0 \tag{8.145}$$

and

$$J_3\psi_l^m = (m + \tfrac{1}{2})\psi_l^m, \qquad -1 - l \leq m \leq l. \tag{8.146}$$
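The counting in the ladder argument is easy to tabulate. The short Python sketch below is an illustration only (the function names `j3_spectrum` and `casimir` are ours): it lists the $J_3$ eigenvalues $m + 1/2$ for a given $l$ and checks the two end-of-ladder conditions (8.141) and (8.144) with exact rational arithmetic.

```python
from fractions import Fraction

def j3_spectrum(l):
    """J_3 eigenvalues m + 1/2 for m = l, l-1, ..., -(l+1), highest first."""
    return [Fraction(2 * m + 1, 2) for m in range(l, -(l + 2), -1)]

def casimir(l):
    """Eigenvalue (l + 1/2)(l + 3/2) of J_i J_i."""
    return Fraction(2 * l + 1, 2) * Fraction(2 * l + 3, 2)

for l in range(3):
    spectrum = j3_spectrum(l)
    top, bottom = spectrum[0], spectrum[-1]
    # Conditions (8.141) and (8.144) at the two ends of the ladder.
    assert casimir(l) == top * (top + 1) == bottom * (bottom - 1)
    print(l, [str(mu) for mu in spectrum], len(spectrum))
```

For each $l$ the list has $2(l + 1)$ entries, running from $l + 1/2$ down to $-(l + 1/2)$ in unit steps.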
To find an explicit form for the $\psi_l^m$ we first construct the highest-$m$ case.
This satisfies

$$J_+\psi_l^l = 0 \tag{8.147}$$

and it is not hard to see that this equation is solved by

$$\psi_l^l \propto \sin^l(\theta)\, e^{l\phi I\sigma_3}. \tag{8.148}$$

This is the angular part of the monogenic function $(x + yI\sigma_3)^l$. Introducing a
convenient factor, we write

$$\psi_l^l = (2l + 1)P_l^l(\cos\theta)\, e^{l\phi I\sigma_3}. \tag{8.149}$$

Our convention for the associated Legendre polynomials follows Gradshteyn &
Ryzhik (1994), so we have

$$P_l^m(x) = \frac{(-1)^m}{2^l\, l!}(1 - x^2)^{m/2}\frac{d^{l+m}}{dx^{l+m}}(x^2 - 1)^l. \tag{8.150}$$

(Some useful recursion relations for the associated Legendre polynomials are
discussed in the exercises.) The lowering operator $J_-$ has the following effect
on $\psi$:

$$J_-\psi = \bigl(-\partial_\theta\psi + \cot(\theta)\,\partial_\phi\psi I\sigma_3\bigr)e^{-\phi I\sigma_3} - I\sigma_2\tfrac{1}{2}(\psi + \sigma_3\psi\sigma_3). \tag{8.151}$$

The final term just projects out the $\{1, I\sigma_3\}$ terms and multiplies them by $-I\sigma_2$.
This is the analog of the lowering matrix in the standard formalism. The
derivatives acting on $\psi_l^l$ form

$$\bigl(-\partial_\theta\psi_l^l + \cot(\theta)\,\partial_\phi\psi_l^l I\sigma_3\bigr)e^{-\phi I\sigma_3} = (2l + 1)\,2l\,P_l^{l-1}(\cos\theta)\, e^{(l-1)\phi I\sigma_3}, \tag{8.152}$$

and, if we use the result that

$$\sigma_\phi = \sigma_2 e^{\phi I\sigma_3}, \tag{8.153}$$

we find that

$$\psi_l^{l-1} \propto \bigl(2l\,P_l^{l-1}(\cos\theta) - P_l^l(\cos\theta)I\sigma_\phi\bigr)e^{(l-1)\phi I\sigma_3}. \tag{8.154}$$


Proceeding in this manner, we are led to the following formula for the spherical
monogenics:

$$\psi_l^m = \bigl((l + m + 1)P_l^m(\cos\theta) - P_l^{m+1}(\cos\theta)I\sigma_\phi\bigr)e^{m\phi I\sigma_3}, \tag{8.155}$$

in which $l$ is a positive integer or zero, $m$ ranges from $-(l + 1)$ to $l$ and the $P_l^m$
are taken to be zero if $|m| > l$. The positive- and negative-$m$ states are related
by

$$P_l^{-m}(x) = (-1)^m\frac{(l - m)!}{(l + m)!}P_l^m(x), \tag{8.156}$$

from which it can be shown that

$$\psi_l^m(-I\sigma_2) = (-1)^m\frac{(l + m + 1)!}{(l - m)!}\psi_l^{-(m+1)}. \tag{8.157}$$

The spherical monogenics presented here are unnormalised. Normalisation
factors are not hard to compute, and we find that

$$\int_0^\pi d\theta\int_0^{2\pi} d\phi\, \sin(\theta)\, \psi_l^{m\dagger}\psi_l^m = 4\pi\frac{(l + m + 1)!}{(l - m)!}. \tag{8.158}$$
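The normalisation integral (8.158) can be checked numerically. The sketch below is illustrative and rests on one step not spelled out in the text: since $I\sigma_\phi$ squares to $-1$ and the exponential factors cancel, equation (8.155) gives $\psi_l^{m\dagger}\psi_l^m = ((l+m+1)P_l^m)^2 + (P_l^{m+1})^2$, which is independent of $\phi$. The function names, the recurrence used to evaluate $P_l^m$ and the use of Simpson's rule are our choices, not the book's.

```python
import math

def assoc_legendre(l, m, x):
    """P_l^m(x) in the convention of (8.150); taken to be zero if |m| > l."""
    if abs(m) > l:
        return 0.0
    if m < 0:
        k = -m  # negative orders from equation (8.156)
        ratio = math.factorial(l - k) / math.factorial(l + k)
        return (-1) ** k * ratio * assoc_legendre(l, k, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):          # P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
        p_prev *= -(2 * k - 1) * s
    if l == m:
        return p_prev
    p_curr = x * (2 * m + 1) * p_prev  # P_{m+1}^m
    for n in range(m + 2, l + 1):      # upward recurrence in the degree
        p_prev, p_curr = p_curr, ((2 * n - 1) * x * p_curr - (n + m - 1) * p_prev) / (n - m)
    return p_curr

def norm_integral(l, m, steps=2000):
    """Integral of psi^dagger psi over the sphere, by Simpson's rule in x = cos(theta)."""
    def density(x):
        a = (l + m + 1) * assoc_legendre(l, m, x)
        b = assoc_legendre(l, m + 1, x)
        return a * a + b * b
    h = 2.0 / steps
    total = density(-1.0) + density(1.0)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(-1.0 + k * h)
    return 2.0 * math.pi * total * h / 3.0

def norm_formula(l, m):
    """Right-hand side of equation (8.158)."""
    return 4.0 * math.pi * math.factorial(l + m + 1) / math.factorial(l - m)

for l in range(3):
    for m in range(-(l + 1), l + 1):
        print(l, m, norm_integral(l, m), norm_formula(l, m))
```

The two columns agree to within the quadrature error for every allowed pair $(l, m)$, including the end cases $m = l$ and $m = -(l + 1)$ where one of the two Legendre terms vanishes.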
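If $\sigma_r$ denotes the unit radial vector, $\sigma_r = \mathbf{x}/r$, we find that

$$\mathbf{x}\wedge\boldsymbol{\nabla}\sigma_r = 2\sigma_r. \tag{8.159}$$

It follows that

$$-\mathbf{x}\wedge\boldsymbol{\nabla}(\sigma_r\psi\sigma_3) = -(l + 2)\sigma_r\psi\sigma_3, \tag{8.160}$$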


which provides an equation for the negative-l eigenstates. The possible eigenval- 
ues and degeneracies are summarised in Table 8.2. One curious feature of this 
table is that we appear to be missing a line for the eigenvalue l = —1. In fact 
solutions for this case do exist, but they contain singularities which render them 
unnormalisable. For example, the functions 


(8.161) 


have $l = -1$ and $J_3$ eigenvalues $+1/2$ and $-1/2$ respectively. Both solutions are
singular along the z axis, however, which limits their physical relevance. 
l       Eigenvalues of J_3      Degeneracy
2       5/2, ..., -5/2          6
1       3/2, ..., -3/2          4
0       1/2, ..., -1/2          2
(-1)    ?                       ?
-2      1/2, ..., -1/2          2


Table 8.2 Eigenvalues and degeneracies for the $\psi_l^m$ monogenics.


8.4.2 The radial equations 


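We can use the angular monogenics to construct eigenfunctions of the Dirac
Hamiltonian of equation (8.123). Since the $J_B$ operators commute with $H$, $\psi$
can be placed in an eigenstate of $J_3$. The operator $J_iJ_i$ must also commute with
$H$, so $(l + 1/2)(l + 3/2)$ is a good quantum number. The operator
$\mathbf{x}\wedge\boldsymbol{\nabla}$ does not commute with $H$, however, so both the
$\psi_l^m$ and $\sigma_r\psi_l^m\sigma_3$ monogenics are needed in the solution. While
$\mathbf{x}\wedge\boldsymbol{\nabla}$ does not commute with $H$, the operator

$$\hat{K} = \hat{\gamma}_0(1 - \mathbf{x}\wedge\boldsymbol{\nabla}) \tag{8.162}$$

does, as follows from

$$[\hat{\gamma}_0(1 - \mathbf{x}\wedge\boldsymbol{\nabla}),\, \boldsymbol{\nabla}] = 2\hat{\gamma}_0\boldsymbol{\nabla} - \hat{\gamma}_0\dot{\boldsymbol{\nabla}}\,\dot{\mathbf{x}}\wedge\boldsymbol{\nabla} = 0. \tag{8.163}$$

We should therefore work with eigenstates of the $\hat{K}$ operator. This implies that
$\psi(\mathbf{x})$ can be written for positive $l$ as either

$$\psi(\mathbf{x}, l + 1) = \psi_l^m u(r) + \sigma_r\psi_l^m v(r)I\sigma_3 \tag{8.164}$$

or

$$\psi(\mathbf{x}, -(l + 1)) = \sigma_r\psi_l^m\sigma_3 u(r) + \psi_l^m Iv(r). \tag{8.165}$$

In both cases the second label in $\psi(\mathbf{x}, l + 1)$ specifies the eigenvalue of
$\hat{K}$. It is useful to denote this by $\kappa$, so we have

$$\hat{K}\psi = \kappa\psi, \qquad \kappa = \ldots, -2, -1, 1, 2, \ldots \tag{8.166}$$

and $\kappa$ is a non-zero positive or negative integer.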

In equations (8.164) and (8.165) the radial functions $u(r)$ and $v(r)$ are
‘complex’ combinations of 1 and $I\sigma_3$. In the case of the Hamiltonian of (8.123),
with $V(r)$ real, it turns out that the real and imaginary equations decouple, and it
is sufficient to treat $u(r)$ and $v(r)$ as real, scalar quantities. On substituting our
trial functions into the Hamiltonian, we find that the radial equations reduce to

$$\begin{pmatrix} u' \\ v' \end{pmatrix} =
\begin{pmatrix} (\kappa - 1)/r & -(E - eV(r) + m) \\ E - eV(r) - m & (-\kappa - 1)/r \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}. \tag{8.167}$$

The same equation holds for all values of $\kappa$. This successfully separates the Dirac
equation in any radially-symmetric potential. As one might expect, we arrive
at a pair of coupled first-order equations, as opposed to the single second-order
equation familiar from Schrödinger theory.


8.4.3 The hydrogen atom


The radial equations describing the relativistic quantum theory of the hydrogen
atom are obtained simply by setting $eV = -Z\alpha/r$, where $\alpha = e^2/4\pi$ is the fine
structure constant and $Z$ is the atomic charge. The solution of the radial
equations is described in most textbooks on relativistic quantum mechanics. The
conclusion is that the radial dependence is governed by a pair of hypergeometric
functions, which generalise the Laguerre polynomials of the non-relativistic
theory. Rather than reproduce the analysis here, we instead present a more
direct method of solving the equations, first given by Eddington (1936) in his
unconventional Relativity Theory of Protons and Electrons.

We start with the equation

$$-j\boldsymbol{\nabla}\psi - \frac{Z\alpha}{r}\psi + m\hat{\gamma}_0\psi = E\psi. \tag{8.168}$$

We assume that $\psi$ is in an eigenstate of $\hat{K}$, so we can write

$$\mathbf{x}\wedge\boldsymbol{\nabla}\psi = \psi - \kappa\hat{\gamma}_0\psi. \tag{8.169}$$

We now pre-multiply the Dirac equation by $j\mathbf{x}$ and rearrange to find

$$r\partial_r\psi + \psi - \kappa\hat{\gamma}_0\psi = j\mathbf{x}(E + Z\alpha/r)\psi - jm\mathbf{x}\hat{\gamma}_0\psi. \tag{8.170}$$

On introducing the reduced function $\Psi = r\psi$ the equation simplifies to

$$\partial_r\Psi = j\sigma_r(E - m\hat{\gamma}_0)\Psi + \frac{1}{r}(jZ\alpha\sigma_r + \kappa\hat{\gamma}_0)\Psi. \tag{8.171}$$

We accordingly define the two operators

$$\hat{F} = -j\sigma_r(E - m\hat{\gamma}_0), \qquad \hat{G} = -(jZ\alpha\sigma_r + \kappa\hat{\gamma}_0), \tag{8.172}$$

so that the Dirac equation reduces to

$$\partial_r\Psi + \Bigl(\hat{F} + \frac{\hat{G}}{r}\Bigr)\Psi = 0. \tag{8.173}$$




The $\hat{F}$ and $\hat{G}$ operators satisfy

$$\hat{F}^2 = m^2 - E^2 = f^2, \qquad \hat{G}^2 = \kappa^2 - (Z\alpha)^2 = \nu^2, \tag{8.174}$$

which define $f$ and $\nu$. The operators also satisfy the anticommutation relation

$$\hat{F}\hat{G} + \hat{G}\hat{F} = -2Z\alpha E. \tag{8.175}$$

The next step is to transform to the dimensionless variable $x = fr$ and remove
the large-$x$ behaviour by setting

$$\Psi = \Phi e^{-x}. \tag{8.176}$$

The function $\Phi$ now satisfies

$$\partial_x\Phi + \frac{\hat{G}}{x}\Phi + \Bigl(\frac{\hat{F}}{f} - 1\Bigr)\Phi = 0. \tag{8.177}$$

We are now in a position to consider a power series solution, so we set

$$\Phi = x^s\sum_{n=0}^{\infty} C_n x^n, \tag{8.178}$$

where the $C_n$ are all multivectors. (In Eddington’s original notation these are
his ‘e-numbers’.) The recursion relation is first-order and is given simply by

$$(n + s + \hat{G})C_n = -\Bigl(\frac{\hat{F}}{f} - 1\Bigr)C_{n-1}. \tag{8.179}$$

Setting $n = 0$ we see that

$$(s + \hat{G})C_0 = 0. \tag{8.180}$$

Acting on this equation with the operator $(s - \hat{G})$ we see that we must have
$s^2 = \hat{G}^2 = \nu^2$. We set $s = \nu$ in order that the wavefunction is well behaved at
the origin.

With the small and large $x$ behaviour now separated out, all that remains
is the power series. One can show that, in order for $\psi$ to fall to zero at large
distances, the series must terminate. We therefore set $C_{n+1} = 0$, and it follows
that

$$\Bigl(\frac{\hat{F}}{f} - 1\Bigr)C_n = 0, \quad\text{or}\quad \hat{F}C_n = fC_n. \tag{8.181}$$

But we also have

$$\Bigl(\frac{\hat{F}}{f} + 1\Bigr)(n + \nu + \hat{G})C_n = -\Bigl(\frac{\hat{F}}{f} + 1\Bigr)\Bigl(\frac{\hat{F}}{f} - 1\Bigr)C_{n-1} = 0, \tag{8.182}$$

so

$$\Bigl(2(n + \nu) + \hat{G} + \frac{\hat{F}\hat{G}}{f}\Bigr)C_n = 0. \tag{8.183}$$

If we write this as

$$\Bigl(2(n + \nu) + \frac{\hat{G}\hat{F}}{f} + \frac{\hat{F}\hat{G}}{f}\Bigr)C_n = 0, \tag{8.184}$$

we find that we must have

$$n + \nu - \frac{Z\alpha E}{f} = 0. \tag{8.185}$$

This is precisely our energy quantisation condition. The equation is equivalent
to

$$\frac{E}{\sqrt{m^2 - E^2}} = \frac{n + \nu}{Z\alpha}, \tag{8.186}$$

which rearranges to the standard formula

$$E^2 = m^2\Bigl(1 - \frac{(Z\alpha)^2}{n^2 + 2n\nu + \kappa^2}\Bigr), \tag{8.187}$$

where $n$ is a non-negative integer.
The non-relativistic formula for the energy levels is recovered by first recalling
that $\alpha \approx 1/137$ is small. We can therefore approximate to

$$\nu \approx |\kappa| = l + 1, \tag{8.188}$$

where $l \geq 0$ and

$$E \approx m\Bigl(1 - \frac{(Z\alpha)^2}{2}\,\frac{1}{n^2 + 2n(l + 1) + (l + 1)^2}\Bigr). \tag{8.189}$$

Subtracting off the rest mass energy we are left with the non-relativistic
expression

$$E_{NR} = -m\frac{(Z\alpha)^2}{2}\,\frac{1}{(n + l + 1)^2} = -\frac{mZ^2e^4}{32\pi^2\epsilon_0^2\hbar^2}\,\frac{1}{n'^2}, \tag{8.190}$$

where $n' = n + l + 1$ and the dimensional constants have been reinserted. We
have recovered the familiar Bohr formula for the energy levels. This derivation
shows that the relativistic quantum number $n$ differs from the Bohr quantum
number $n'$.

[Figure 8.3: energy level diagram. Levels shown, in order of increasing energy:
1S_{1/2}; 2P_{1/2}, 2S_{1/2}, 2P_{3/2}; 3S_{1/2}, 3P_{3/2}, 3D_{3/2}, 3D_{5/2}. The fine
structure, Lamb shift and hyperfine structure splittings are marked.]

Figure 8.3 Hydrogen atom energy levels. The diagram illustrates how
various degeneracies are broken by relativistic and spin effects. The Dirac
equation accounts for the fine structure. The hyperfine structure is due to
interaction with the magnetic moment of the nucleus. The Lamb shift is
explained by quantum field theory. It lifts the degeneracy between the $S_{1/2}$
and $P_{1/2}$ states.

Expanding to next order we find that

$$E_{NR} = -m\frac{(Z\alpha)^2}{2n'^2} - m\frac{(Z\alpha)^4}{2n'^4}\Bigl(\frac{n'}{l + 1} - \frac{3}{4}\Bigr). \tag{8.191}$$

The first relativistic correction shows that the binding energy is increased slightly
from the non-relativistic value, and also introduces some dependence on the
angular quantum number $l$. This lifts some degeneracies present in the
non-relativistic solution. The various corrections contributing to the energy levels
are shown in figure 8.3. A more complete analysis also requires replacing the
electron mass $m$ by the reduced mass of the two-body system. This introduces
corrections of the same order as the relativistic corrections, but only affects the
overall scale.
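To see how good the expansion is, the exact quantisation formula (8.187) can be compared numerically with the two-term expansion (8.191). The Python sketch below is an illustration only: it works in units with $m = 1$, takes $|\kappa| = l + 1$ as in (8.188), and uses an approximate numerical value for $\alpha$ that is our assumption rather than a figure from the text. The function names are ours.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant (assumed value)

def dirac_energy(n, kappa, Z=1, m=1.0):
    """Exact energy from equation (8.187), with nu^2 = kappa^2 - (Z alpha)^2."""
    za = Z * ALPHA
    nu = math.sqrt(kappa ** 2 - za ** 2)
    return m * math.sqrt(1.0 - za ** 2 / (n ** 2 + 2 * n * nu + kappa ** 2))

def expanded_binding(n, l, Z=1, m=1.0):
    """Binding energy E - m to order (Z alpha)^4, equation (8.191), with |kappa| = l + 1."""
    za = Z * ALPHA
    n_bohr = n + l + 1  # the Bohr quantum number n'
    bohr_term = -m * za ** 2 / (2 * n_bohr ** 2)
    fine_term = -m * za ** 4 / (2 * n_bohr ** 4) * (n_bohr / (l + 1) - 0.75)
    return bohr_term + fine_term

for n in range(3):
    for l in range(3):
        exact = dirac_energy(n, l + 1) - 1.0
        print(n, l, exact, expanded_binding(n, l))
```

The residual difference is of order $(Z\alpha)^6$. States with the same $n'$ but different $|\kappa|$, such as $(n, |\kappa|) = (1, 1)$ and $(0, 2)$, come out with slightly different energies, which is the fine structure splitting described above.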


8.5 Scattering theory 


Many of the experimental tests of Dirac theory, and quantum electrodynamics
in general, are based on the results of scattering. Here we see how our new
formulation can help to simplify these calculations through its handling of spin.
To aid this analysis it is useful to introduce the energy projection operators

$$\hat{\Lambda}_\pm\psi = \frac{1}{2m}(m\psi \pm p\psi\gamma_0), \tag{8.192}$$

which project onto particle and antiparticle states.

A key role in relativistic quantum theory is played by Feynman propagators,
which provide a means of imposing causal boundary conditions. We start by
replacing the Dirac equation with the integral equation

$$\psi(x) = \psi_i(x) + e\int d^4x'\, S_F(x - x')\hat{A}(x')\psi(x'), \tag{8.193}$$

where $\psi_i(x)$ is the asymptotic in-state and solves the free-particle equation, and
$S_F(x - x')$ is the propagator. Substituting (8.193) into the Dirac equation, we
find that $S_F(x - x')$ must satisfy

$$j\nabla_x S_F(x - x')\psi(x')\gamma_0 - mS_F(x - x')\psi(x') = \delta^4(x - x')\psi(x'). \tag{8.194}$$

The solution to this equation is

$$S_F(x - x')\psi(x') = \int\frac{d^4p}{(2\pi)^4}\,\frac{p\psi(x')\gamma_0 + m\psi(x')}{p^2 - m^2 + j\epsilon}\, e^{-jp\cdot(x - x')}. \tag{8.195}$$

The factor of $j\epsilon$ is a mnemonic device to tell us how to negotiate the poles in the
complex energy integral, which is performed first. The factor ensures
positive-frequency waves propagate into the future ($t > t'$) and negative-frequency
waves propagate into the past ($t' > t$). The result of performing the energy
integration is summarised in the expression

$$S_F(x) = -2mj\int\frac{d^3p}{2E(2\pi)^3}\bigl(\theta(t)\hat{\Lambda}_+ e^{-jp\cdot x} + \theta(-t)\hat{\Lambda}_- e^{jp\cdot x}\bigr), \tag{8.196}$$

where $E = +\sqrt{\mathbf{p}^2 + m^2}$.

There are other choices of relativistic propagator, which may be appropriate 
in other settings. For classical electromagnetism, for example, it is necessary to 
work with retarded propagators. If one constructs a closed spacetime surface 
integral, with boundary conditions consistent with the field equations, then the 
choice of propagator is irrelevant, since they all differ by a spacetime monogenic 
function. In most applications, however, we do not work like this. Instead we 
work with initial data, which we seek to propagate to a later time in such a way 
that the final result is consistent with imposing causal boundary conditions. In 
this case one has to use the Feynman propagator for quantum fields. 
8.5.1 Electron scattering


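In scattering calculations we write the wavefunction as the sum of an incoming
plane wave and a scattered beam,

$$\psi(x) = \psi_i(x) + \psi_{\mathrm{diff}}(x). \tag{8.197}$$

At asymptotically large times $\psi_{\mathrm{diff}}$ is given by

$$\psi_{\mathrm{diff}}(x) = -2mje\int d^4x'\int\frac{d^3p}{2E_p(2\pi)^3}\,\hat{\Lambda}_+\bigl[\hat{A}(x')\psi(x')\bigr]e^{-jp\cdot(x - x')}. \tag{8.198}$$

This can be written as a sum over final states

$$\psi_{\mathrm{diff}}(x) = \int\frac{d^3p_f}{2E_f(2\pi)^3}\,\psi_f e^{-jp_f\cdot x}, \tag{8.199}$$

where the final states are plane waves with

$$\psi_f = -je\int d^4x'\,\bigl(p_fA(x')\psi(x') + mA(x')\psi(x')\gamma_0\bigr)e^{jp_f\cdot x'}. \tag{8.200}$$

The number of scattered particles is given by (recalling that $J = \psi\gamma_0\tilde{\psi}$)

$$\int d^3x\,\gamma_0\cdot J_{\mathrm{diff}} = \int\frac{d^3p_f}{2E_f(2\pi)^3}\,\frac{\gamma_0\cdot J_f}{2E_f} = \int\frac{d^3p_f}{2E_f(2\pi)^3}\,N_f, \tag{8.201}$$

where $N_f$ is the number density per Lorentz-invariant phase space interval:

$$N_f = \frac{\gamma_0\cdot J_f}{2E_f} = \frac{\gamma_0\cdot(\psi_f\gamma_0\tilde{\psi}_f)}{2E_f} = \frac{\rho_f}{2m}. \tag{8.202}$$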

The integral equation (8.193) is the basis for a perturbative approach to solving
the Dirac equation in an external field. We seek the full propagator $S_A$ which
satisfies

$$(j\nabla_2 - eA(x_2))S_A(x_2, x_1)\gamma_0 - mS_A(x_2, x_1) = \delta^4(x_2 - x_1). \tag{8.203}$$

The iterative solution to this is provided by

$$\begin{aligned}
S_A(x_f, x_i) = {} & S_F(x_f - x_i) + \int d^4x_1\, S_F(x_f - x_1)e\hat{A}(x_1)S_F(x_1 - x_i) \\
& + \iint d^4x_1\, d^4x_2\, S_F(x_f - x_1)e\hat{A}(x_1)S_F(x_1 - x_2)e\hat{A}(x_2)S_F(x_2 - x_i) + \cdots,
\end{aligned} \tag{8.204}$$

which is the basis for a diagrammatic representation of a scattering calculation.
In the Born approximation we work to first order and truncate the series for $S_A$
after the first interaction term. Assuming incident plane waves of momentum
$p_i$, so that $\psi(x) = \psi_i\exp(-jp_i\cdot x)$, we find that the final states become

$$\psi_f = -je\int d^4x'\,\bigl(p_fA(x') + A(x')p_i\bigr)\psi_i e^{jq\cdot x'} = -je\bigl(p_fA(q) + A(q)p_i\bigr)\psi_i, \tag{8.205}$$

where $q = p_f - p_i$ is the change in momentum, and $A(q)$ is the Fourier transform
of the electromagnetic potential. The form of the result here is quite typical,
and in general we can write

$$\psi_f = S_{fi}\psi_i, \tag{8.206}$$

where $S_{fi}$ is the scattering operator. This is a multivector that takes initial states
into final states. Since both $\psi_i$ and $\psi_f$ are plane-wave particle states, we must
have

$$S_{fi}\tilde{S}_{fi} = \rho_{fi}, \tag{8.207}$$

where $\rho_{fi}$ is a scalar quantity (which determines the cross section). We can
therefore decompose $S_{fi}$ as

$$S_{fi} = \rho_{fi}^{1/2}R_{fi}, \tag{8.208}$$

where $R_{fi}$ is a rotor. This rotor takes the initial momentum to the final
momentum,

$$R_{fi}\,p_i\,\tilde{R}_{fi} = p_f. \tag{8.209}$$


8.5.2 Spin effects in scattering 


The multivector $S_{fi}$ depends on the initial and final momenta and, in some cases,
the initial spin. The final spin is determined from the initial spin by the rotation
encoded in $S_{fi}$. If $s_i$ and $s_f$ denote the initial and final (unit) spin vectors, we
have

$$s_f = R_{fi}\,s_i\,\tilde{R}_{fi}. \tag{8.210}$$

Sometimes it is of greater interest to separate out the boost terms in $R_{fi}$ to
isolate a pure rotation in the $\gamma_0$ frame. This tells us directly what happens to
the spin vector in the electron’s rest frame. With $L_i$ and $L_f$ the appropriate
pure boosts, we define the rest spin scattering operator

$$U_{fi} = \tilde{L}_fR_{fi}L_i. \tag{8.211}$$

This satisfies

$$U_{fi}\gamma_0\tilde{U}_{fi} = \frac{1}{m}\tilde{L}_fR_{fi}\,p_i\,\tilde{R}_{fi}L_f = \gamma_0, \tag{8.212}$$

so is a pure rotation in the $\gamma_0$ frame.




The fact that $p_fS_{fi} = S_{fi}p_i$ ensures that $S_{fi}$ is always of the form

$$S_{fi} = -j(p_fM + Mp_i), \tag{8.213}$$

where $M$ is an odd-grade multivector. In the Born approximation of
equation (8.205), for example, we have $M = eA(q)$. In general, $M$ can contain both
real and imaginary terms, so we must write

$$S_{fi}\psi_i = -j\bigl(p_f(M_r + jM_j) + (M_r + jM_j)p_i\bigr)\psi_i, \tag{8.214}$$

where $M_j$ and $M_r$ are independent of $j$. We can now use

$$j\psi_i = \psi_iI\sigma_3 = \hat{S}_i\psi_i, \tag{8.215}$$

where $\hat{S}_i$ is the initial unit spin bivector. Since $\hat{S}_i$ and $p_i$ commute, $S_{fi}$ can
still be written in the form of equation (8.213), with

$$M = M_r + M_j\hat{S}_i. \tag{8.216}$$

So $M$ remains a real multivector, which now depends on the initial spin. This
scheme is helpful if we are interested in any spin-dependent features of the
scattering process.


8.5.3 Positron scattering and pair annihilation 


Adapting the preceding results to positron scattering is straightforward. In this
case a negative-energy plane wave arrives from the future and scatters into the
past, so we set

$$\psi_i(x) = \psi_i e^{jp_i\cdot x}, \qquad \psi_f(x) = \psi_f e^{jp_f\cdot x}. \tag{8.217}$$

In this case repeating the analysis gives

$$S_{fi}\psi_i = -j(-p_fM\psi_i + mM\psi_i\gamma_0), \tag{8.218}$$

which we can write as

$$S_{fi} = j(p_fM + Mp_i). \tag{8.219}$$

This amounts to simply swapping the sign of $S_{fi}$. In the Born approximation,
$q$ is replaced by $-q$ in the Fourier transform of $A(x)$, which will alter the factor
$M$ if $A(x)$ is complex.

The other case to consider is when the incoming electron is scattered into the
past, corresponding to pair annihilation. In this case we have

$$S_{fi} = -j(-p_2M + Mp_1), \tag{8.220}$$

where $p_1$ and $p_2$ are the incoming momenta of the electron and positron
respectively. We decompose $S_{fi}$ as

$$S_{fi} = \rho_{fi}^{1/2}IR_{fi}, \tag{8.221}$$

since $S_{fi}$ must now contain a factor of $I$ to map electrons into positrons. This
form for $S_{fi}$ implies that

$$S_{fi}\tilde{S}_{fi} = -\rho_{fi}. \tag{8.222}$$

The minus sign reflects the fact that the transformation between initial and final
momenta is not proper orthochronous.


8.5.4 Cross sections 


We must now relate our results to the cross sections measured in experiments.
The scattering rate into the final states, per unit volume, per unit time, is given
by

$$W_{fi} = \frac{1}{VT}N_f = \frac{1}{VT}\,\frac{\gamma_0\cdot J_f}{2E_f} = \frac{\rho_f}{2mVT}, \tag{8.223}$$

where $V$ and $T$ denote the total volume and time respectively. The density $\rho_f$
is given by

$$\rho_f = |S_{fi}\tilde{S}_{fi}|\rho_i = \rho_{fi}\rho_i. \tag{8.224}$$

Here $S_{fi}\tilde{S}_{fi} = \pm\rho_{fi}$, where the plus sign corresponds to electron to electron
and positron to positron scattering, and the minus sign to electron-positron
annihilation.

The differential cross section is defined as

$$d\sigma = \frac{W_{fi}}{\text{target density} \times \text{incident flux}}. \tag{8.225}$$

When $S_{fi}$ is of the form

$$S_{fi} = -j(2\pi)^4\delta^4(P_f - P_i)T_{fi}, \tag{8.226}$$

where the $\delta$-function ensures conservation of total momentum, we have

$$|S_{fi}|^2 = VT(2\pi)^4\delta^4(P_f - P_i)|T_{fi}|^2. \tag{8.227}$$

Working in the $J_i$ frame the target density is just $\rho_i$ so, writing the incident flux
as $\chi$, we have

$$d\sigma = \frac{1}{2m\chi}(2\pi)^4\delta^4(P_f - P_i)|T_{fi}|^2. \tag{8.228}$$

Alternatively we may be interested in an elastic scattering with just energy
conservation ($E_f = E_i$) and

$$S_{fi} = -j2\pi\delta(E_f - E_i)T_{fi}. \tag{8.229}$$

In this case

$$|S_{fi}|^2 = 2\pi T\delta(E_f - E_i)|T_{fi}|^2. \tag{8.230}$$
A target density of $1/V$ and an incident flux of $|\mathbf{J}_i| = \rho_i|\mathbf{p}_i|/m$ then gives

$$d\sigma = \frac{\pi}{|\mathbf{p}_i|}\delta(E_f - E_i)|T_{fi}|^2. \tag{8.231}$$

The total cross section is obtained by integrating over the available phase space.
For the case of a single particle scattering elastically we find that

$$\sigma = \int\frac{d^3p_f}{2E_f(2\pi)^3}\,\frac{\pi}{|\mathbf{p}_i|}\delta(E_f - E_i)|T_{fi}|^2 = \int d\Omega_f\,\frac{|T_{fi}|^2}{16\pi^2}. \tag{8.232}$$

This is usually expressed in terms of the differential cross section per solid angle:

$$\frac{d\sigma}{d\Omega_f} = \frac{|T_{fi}|^2}{16\pi^2}. \tag{8.233}$$


8.5.5 Coulomb scattering 


As an application of our formalism consider Coulomb scattering from a nucleus,
with the external field defined by

$$A(x) = \frac{-Ze}{4\pi|\mathbf{x}|}\gamma_0. \tag{8.234}$$

Working with the first Born approximation, $M$ is given by $M = eA(q)$, where
$A(q)$ is the Fourier transform of $A(x)$ given by

$$A(q) = -\frac{2\pi Ze}{\mathbf{q}^2}\delta(E_f - E_i)\gamma_0 \tag{8.235}$$

and $q\cdot\gamma_0 = E_f - E_i$. Writing

$$S_{fi} = -j2\pi\delta(E_f - E_i)T_{fi} \tag{8.236}$$

and using energy conservation we find that

$$T_{fi} = -\frac{Ze^2}{\mathbf{q}^2}(2E + \mathbf{q}). \tag{8.237}$$

The cross section is therefore given by the Mott scattering formula:

$$\frac{d\sigma}{d\Omega_f} = \frac{Z^2\alpha^2}{\mathbf{q}^4}(4E^2 - \mathbf{q}^2) = \frac{Z^2\alpha^2}{4\mathbf{p}^2\beta^2\sin^4(\theta/2)}\bigl(1 - \beta^2\sin^2(\theta/2)\bigr), \tag{8.238}$$

where

$$\mathbf{q}^2 = (\mathbf{p}_f - \mathbf{p}_i)^2 = 2\mathbf{p}^2(1 - \cos(\theta)) \quad\text{and}\quad \beta = |\mathbf{p}|/E. \tag{8.239}$$

The angle $\theta$ measures the deviation between the incoming and scattered beams.
In the low velocity limit the Mott result reduces to the Rutherford formula.
The result is independent of the sign of the nuclear charge and, to this order, is
obtained for both electron and positron scattering.
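The two forms of the Mott formula in (8.238) are related by the kinematics of (8.239), and it is a useful check to evaluate both. The following Python sketch is illustrative only: it works in natural units, takes an approximate value for $\alpha$ (our assumption), and writes the Rutherford cross section as the Mott result with the factor $(1 - \beta^2\sin^2(\theta/2))$ removed. The function names are ours.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant (assumed value)

def mott_from_q(E, p, theta, Z=1):
    """First form of (8.238), with q^2 = 2 p^2 (1 - cos(theta)) from (8.239)."""
    q2 = 2.0 * p * p * (1.0 - math.cos(theta))
    return (Z * ALPHA) ** 2 / q2 ** 2 * (4.0 * E * E - q2)

def mott_from_angle(E, p, theta, Z=1):
    """Second form of (8.238), with beta = |p| / E."""
    beta = p / E
    s2 = math.sin(theta / 2.0) ** 2
    return (Z * ALPHA) ** 2 / (4.0 * p * p * beta * beta * s2 * s2) * (1.0 - beta * beta * s2)

def rutherford(E, p, theta, Z=1):
    """Mott formula without the (1 - beta^2 sin^2(theta/2)) factor."""
    beta = p / E
    s2 = math.sin(theta / 2.0) ** 2
    return (Z * ALPHA) ** 2 / (4.0 * p * p * beta * beta * s2 * s2)

m = 1.0
for p in (1e-3, 1.0, 10.0):
    E = math.sqrt(p * p + m * m)
    theta = 1.0
    print(p, mott_from_q(E, p, theta), mott_from_angle(E, p, theta),
          mott_from_angle(E, p, theta) / rutherford(E, p, theta))
```

The first two columns agree, and the final ratio tends to one as $|\mathbf{p}| \to 0$, which is the low velocity limit mentioned above.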
A significant feature of this derivation is that no spin sums are required.
Instead, all the spin dependence is contained in the directional information in $T_{fi}$.
As well as being computationally more efficient, this method for organising cross
section calculations offers deeper insights into the structure of the theory. For
Coulomb scattering the spin information is contained in the rotor

$$R_{fi} \propto p_f\gamma_0 + \gamma_0p_i \propto L_f^2 + \tilde{L}_i^2, \tag{8.240}$$

where $L_f$ and $L_i$ are the pure boosts from $\gamma_0$ to $p_f$ and $p_i$ respectively. The
behaviour of the rest spin is governed by the unnormalised rotor

$$U_{fi} = \tilde{L}_f(L_f^2 + \tilde{L}_i^2)L_i = L_fL_i + \tilde{L}_f\tilde{L}_i \propto (E + m)^2 + \mathbf{p}_f\mathbf{p}_i. \tag{8.241}$$

It follows that the rest-spin vector precesses in the $\mathbf{p}_f\wedge\mathbf{p}_i$ plane through an
angle $\delta$, where

$$\tan(\delta/2) = \frac{\sin(\theta)}{(E + m)/(E - m) + \cos(\theta)}. \tag{8.242}$$

This method of calculating the spin precession for Coulomb scattering was first
described by Hestenes (1982a).
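The step from the unnormalised rotor to the precession angle can be made concrete. In the sketch below (illustrative Python, with function names of our choosing) the product $\mathbf{p}_f\mathbf{p}_i$ is split into a scalar part $\mathbf{p}^2\cos\theta$ and a bivector part of magnitude $\mathbf{p}^2\sin\theta$, using $|\mathbf{p}_f| = |\mathbf{p}_i|$ for elastic scattering. The half-angle of the rotation is then the ratio of bivector magnitude to scalar part, which should reproduce (8.242).

```python
import math

def precession_angle(E, m, theta):
    """Rest-spin precession angle delta from equation (8.242); requires E > m."""
    return 2.0 * math.atan2(math.sin(theta), (E + m) / (E - m) + math.cos(theta))

def precession_angle_from_rotor(E, m, theta):
    """Same angle read off from the rotor (E + m)^2 + p_f p_i of equation (8.241)."""
    p_squared = E * E - m * m
    scalar_part = (E + m) ** 2 + p_squared * math.cos(theta)
    bivector_size = p_squared * math.sin(theta)
    return 2.0 * math.atan2(bivector_size, scalar_part)

for E in (1.001, 1.5, 3.0, 1000.0):
    theta = 1.0
    print(E, precession_angle(E, 1.0, theta), precession_angle_from_rotor(E, 1.0, theta))
```

The two evaluations agree because $\mathbf{p}^2 = (E - m)(E + m)$. In this formula $\delta \to 0$ as $E \to m$ and $\delta \to \theta$ when $E \gg m$.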


8.5.6 Compton scattering 


Compton scattering is the process in which an electron scatters off a photon. To
lowest order there are two Feynman diagrams to consider, shown in figure 8.4.
The preceding analysis follows through with little modification, and gives rise
to two terms of the form

$$M_1 = e^2\int d^4x_1\, d^4x_2\int\frac{d^4p}{(2\pi)^4}\,A_1(x_1)\,\frac{pA_2(x_2) + A_2(x_2)p_i}{p^2 - m^2 + j\epsilon}\, e^{jx_1\cdot(p_f - p)}\, e^{jx_2\cdot(p - p_i)}, \tag{8.243}$$

where

$$A(x) = \epsilon\, e^{\mp jk\cdot x} \tag{8.244}$$

is the (complex) vector potential. The vector $\epsilon$ denotes the polarisation state,
so $k\cdot\epsilon = 0$ and $\epsilon^2 = -1$. In relativistic quantum theory there appears to be no
alternative but to work with a fully complex vector potential.

[Figure 8.4: two Feynman diagrams, each with incoming electron momentum p_i and
outgoing electron momentum p_f.]

Figure 8.4 Compton scattering. Two diagrams contribute to the amplitude,
to lowest order.

Performing the integrations and summing the two contributions we arrive at

$$M = e^2(2\pi)^4\delta^4(P)\left(\epsilon_f\,\frac{(p_i + k_i)\epsilon_i + \epsilon_ip_i}{2k_i\cdot p_i} + \epsilon_i\,\frac{(p_i - k_f)\epsilon_f + \epsilon_fp_i}{-2p_i\cdot k_f}\right), \tag{8.245}$$

where $P = p_f + k_f - p_i - k_i$, so that the $\delta$-function enforces momentum
conservation. Gauge invariance means that we can set $p_i\cdot\epsilon_i = p_i\cdot\epsilon_f = 0$, in
which case $M$ simplifies to

$$M = e^2(2\pi)^4\delta^4(p_f + k_f - p_i - k_i)\left(\frac{\epsilon_fk_i\epsilon_i}{2k_i\cdot p_i} + \frac{\epsilon_ik_f\epsilon_f}{2p_i\cdot k_f}\right). \tag{8.246}$$

We now set

$$S_{fi} = -j(2\pi)^4\delta^4(p_f + k_f - p_i - k_i)T_{fi}, \tag{8.247}$$

so that the cross section is given by equation (8.228). After a little work, and
making use of momentum conservation, we find that

$$|T_{fi}|^2 = e^4\left(4(\epsilon_i\cdot\epsilon_f)^2 - 2 + \frac{p_i\cdot k_f}{p_i\cdot k_i} + \frac{p_i\cdot k_i}{p_i\cdot k_f}\right). \tag{8.248}$$

This is all that is required to calculate the cross section in any desired frame.
Again, this derivation applies regardless of the initial electron spin.
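As an illustration of evaluating (8.248) in a particular frame, take the electron initially at rest, so that $p_i = m\gamma_0$ and $p_i\cdot k = m\omega$ for a photon of frequency $\omega$. The Python sketch below assumes the standard Compton relation between the incoming and scattered frequencies, which is a kinematic input not derived in the text, and quotes the Klein-Nishina formula in the form stated in the exercises. The function names, the parameter `eps_dot` (standing for $\epsilon_i\cdot\epsilon_f$) and the numerical value of $\alpha$ are our assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant (assumed value)

def scattered_frequency(w_i, theta, m=1.0):
    """Compton relation for the outgoing photon frequency (standard kinematics, assumed)."""
    return w_i / (1.0 + (w_i / m) * (1.0 - math.cos(theta)))

def t_squared_over_e4(w_i, theta, eps_dot, m=1.0):
    """|T_fi|^2 / e^4 from (8.248) in the electron rest frame, where p_i.k = m * omega."""
    w_f = scattered_frequency(w_i, theta, m)
    return 4.0 * eps_dot ** 2 - 2.0 + w_f / w_i + w_i / w_f

def klein_nishina(w_i, theta, eps_dot, m=1.0):
    """Differential cross section in the form quoted in the exercises."""
    w_f = scattered_frequency(w_i, theta, m)
    return ALPHA ** 2 / (4.0 * m * m) * (w_f / w_i) ** 2 * t_squared_over_e4(w_i, theta, eps_dot, m)

for w_i in (1e-6, 0.1, 1.0, 10.0):
    print(w_i, scattered_frequency(w_i, math.pi / 2), klein_nishina(w_i, math.pi / 2, 1.0))
```

At low frequency the bracket reduces to $4(\epsilon_i\cdot\epsilon_f)^2$ and the cross section tends to $(\alpha^2/m^2)(\epsilon_i\cdot\epsilon_f)^2$, the classical Thomson limit.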

The same scheme can be applied to a wide range of relativistic scattering 
problems. In all cases the spacetime algebra formulation provides a simpler and 
clearer method for handling the spin, as it does not force us to work with a 
preferred basis set. In section 14.4.1 the same formalism is applied to scattering 
from a black hole. At some point, however, it is necessary to face questions of 
second quantisation and the construction of a relativistic multiparticle quantum 
theory. This is discussed in the following chapter. 


8.6 Notes 


A significant amount of new notation was introduced in this chapter, relating
to how spinors are handled in spacetime algebra. Much of this is important in
later chapters, and the most useful results of this approach are summarised in
table 8.3.

Pauli spinors        $|\psi\rangle = \begin{pmatrix} a^0 + ia^3 \\ -a^2 + ia^1 \end{pmatrix} \leftrightarrow \psi = a^0 + a^kI\sigma_k$

Pauli operators      $\hat{\sigma}_k|\psi\rangle \leftrightarrow \sigma_k\psi\sigma_3$
                     $i|\psi\rangle \leftrightarrow \psi I\sigma_3 = j\psi$
                     $\langle\psi|\phi\rangle \leftrightarrow \langle\psi^\dagger\phi\rangle_q$

Pauli observables    $\rho = \psi\psi^\dagger$,  $s = \tfrac{1}{2}\psi\sigma_3\psi^\dagger$

Dirac spinors        $\begin{pmatrix} |\phi\rangle \\ |\eta\rangle \end{pmatrix} \leftrightarrow \psi = \phi + \eta\sigma_3$

Dirac operators      $\hat{\gamma}_\mu|\psi\rangle \leftrightarrow \gamma_\mu\psi\gamma_0$
                     $j|\psi\rangle \leftrightarrow \psi I\sigma_3$
                     $\hat{\gamma}_5|\psi\rangle \leftrightarrow \psi\sigma_3$
                     $\langle\bar{\psi}|\phi\rangle \leftrightarrow \langle\tilde{\psi}\phi\rangle_q$

Dirac equation       $\nabla\psi I\sigma_3 - eA\psi = m\psi\gamma_0$

Dirac observables    $\rho e^{I\beta} = \psi\tilde{\psi}$,  $J = \psi\gamma_0\tilde{\psi}$
                     $S = \psi I\sigma_3\tilde{\psi}$,  $s = \psi\gamma_3\tilde{\psi}$

Plane-wave states    $\psi^{(+)}(x) = L(p)\Phi e^{-I\sigma_3p\cdot x}$
                     $\psi^{(-)}(x) = L(p)\Phi\sigma_3e^{I\sigma_3p\cdot x}$
                     $L(p) = (p\gamma_0 + m)/\sqrt{2m(E + m)}$


Table 8.3 Quantum states and operators. This table summarises the main
features of the spacetime algebra representation of Pauli and Dirac spinors
and operators.

Quantum mechanics has probably been the most widely researched application
of geometric algebra to date. Many authors have carried out investigations
into whether the spacetime algebra formulation of Dirac theory offers any deeper
insights into the nature of quantum theory. Among the most interesting of these
are Hestenes’ work on zitterbewegung (1990), and his comments on the nature
of the electroweak group (1982b). Many authors have advocated spacetime
algebra as a better computational tool for Dirac theory than the explicit matrix
formulation (augmented with various spin sum rules). A summary of these ideas
is contained in the paper ‘Electron scattering without spin sums’ by Lewis et
al. (2001). Elsewhere, a similar approach has been applied to modelling a spin
measurement (Challinor et al. 1996) and to the results of tunnelling experiments
(Gull et al. 1993b). Much of this work is summarised in the review ‘Spacetime
algebra and electron physics’ by Doran et al. (1996b).

There is no shortage of good textbooks describing standard formulations of 
Dirac theory and quantum electrodynamics. We particularly made use of the 
classic texts by Itzykson & Zuber (1980), and Bjorken & Drell (1964). For a 
detailed exposition of the solution of the Dirac equation in various backgrounds 
one can do little better than Greiner’s Relativistic Quantum Mechanics (1990). 
Also recommended is Grandy’s Relativistic Quantum Mechanics of Leptons and
Fields (1991) which, unusually, does not shy away from the more problematic 
areas of the conceptual foundations of quantum field theory. 


8.7 Exercises


8.1   The spin matrix operators $\hat{s}_k$ are defined as a set of 2 × 2 Hermitian
      matrices satisfying the commutation relations
      $[\hat{s}_i, \hat{s}_j] = i\hbar\epsilon_{ijk}\hat{s}_k$. Given that $\hat{s}_3$ is defined by

      $$\hat{s}_3 = \lambda\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

      show that the remaining matrices are unique, up to an overall choice
      of phase. Find $\lambda$ and show that we can choose the phase such that
      $\hat{s}_k = (\hbar/2)\hat{\sigma}_k$.

8.2   Verify that the equivalence between Pauli spinors and even multivectors
      defined in equation (8.20) is consistent with the operator equivalences

      $$\hat{\sigma}_k|\psi\rangle \leftrightarrow \sigma_k\psi\sigma_3 \qquad (k = 1, 2, 3).$$

8.3   Suppose that two spin-1/2 states are represented by the even multivectors
      $\phi$ and $\psi$, and the accompanying spin vectors are

      $$s_1 = \phi\sigma_3\phi^\dagger \quad\text{and}\quad s_2 = \psi\sigma_3\psi^\dagger.$$

      Prove that the quantum mechanical formula for the probability of
      measuring state $\phi$ in state $\psi$ satisfies

      $$P = \frac{|\langle\phi|\psi\rangle|^2}{\langle\phi|\phi\rangle\langle\psi|\psi\rangle} = \cos^2(\theta/2),$$

      where $\theta$ is the angle between $s_1$ and $s_2$.

8.4   Verify that the Pauli inner product is invariant under both spatial
      rotations and gauge transformations (i.e. rotations in the $I\sigma_3$ plane applied
      to the right of the spinor $\psi$). Repeat the analysis for Dirac spinors.

8.5   Prove that the angular momentum operators
      $L_B = jB\cdot(\mathbf{x}\wedge\boldsymbol{\nabla})$ satisfy

      $$[L_{B_1}, L_{B_2}] = -jL_{B_1\times B_2}.$$

8.6   Prove that, in any dimension,

      $$[B\cdot(\mathbf{x}\wedge\boldsymbol{\nabla}) - \tfrac{1}{2}B,\, \boldsymbol{\nabla}] = 0,$$

      where $B$ is a bivector.

8.7   The Majorana representation is defined in terms of a set of real matrices.
      Prove that the complex conjugation operation in this representation has
      the spacetime algebra equivalent

      $$|\psi\rangle^*_{\mathrm{Maj}} \leftrightarrow \psi\sigma_2.$$

      Confirm that this anticommutes with the operation of multiplying by
      the imaginary.

8.8   Prove that the associated Legendre polynomials satisfy the following
      recursion relations:

      $$(1 - x^2)\frac{dP_l^m(x)}{dx} + mxP_l^m(x) = -(1 - x^2)^{1/2}P_l^{m+1}(x),$$
      $$(1 - x^2)\frac{dP_l^m(x)}{dx} - mxP_l^m(x) = (1 - x^2)^{1/2}(l + m)(l - m + 1)P_l^{m-1}(x).$$

8.9   Prove that the spherical monogenics satisfy

      $$\int d\Omega\, \psi_l^{m\dagger}\psi_{l'}^{m'} = \delta^{mm'}\delta_{ll'}\,4\pi\frac{(l + m + 1)!}{(l - m)!}.$$

8.10  From the result of equation (8.248), show that the cross section for
      scattering of a photon off a free electron (initially at rest) is determined
      by the Klein-Nishina formula

      $$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4m^2}\left(\frac{\omega_f}{\omega_i}\right)^2\left[\frac{\omega_f}{\omega_i} + \frac{\omega_i}{\omega_f} + 4(\epsilon_f\cdot\epsilon_i)^2 - 2\right].$$
Multiparticle states and
quantum entanglement 


The previous chapter dealt with the quantum theory of single particles in a 
background field. In this chapter we turn to the study of multiparticle quantum 
theory. In many ways, this subject is even more strange than the single-particle 
theory, as it forces us to face up to the phenomenon of quantum entanglement. 
The basic idea is simple enough to state. The joint state of a two-particle system
is described by a tensor product state of the form $|\psi\rangle \otimes |\phi\rangle$. This is usually
abbreviated to $|\psi\rangle|\phi\rangle$. Quantum theory allows for linear complex superpositions
of multiparticle states, which allows us to consider states which have no classical
counterpart. An example is the spin singlet state

$$|\varepsilon\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle|1\rangle - |1\rangle|0\rangle\bigr). \tag{9.1}$$

States such as these are referred to as being entangled. The name reflects the
fact that observables for the two particles remain correlated, even if
measurements are performed in such a way that communication between the particles
is impossible. The rapidly evolving subject of quantum information processing
is largely concerned with the properties of entangled states, and the prospects
they offer for quantum computation.

Quantum entanglement is all around us, though rarely in a form we can exploit. 
Typically, a state may entangle with its environment to form a new pure state. 
(A pure state is one that can be described by a single wavefunction, which may 
or may not be entangled.) The problem is that our knowledge of the state of 
the environment is highly limited. All we can measure are the observables of our 
initial state. In this case the wavefunction formulation is of little practical value, 
and instead we have to consider equations for the evolution of the observables 
themselves. This is usually handled by employing a representation in terms of 
density matrices. These lead naturally to concepts of quantum statistical physics 
and quantum definitions of entropy. 
In this chapter we explore how these concepts can be formulated in the lan-
guage of geometric algebra. One of the essential mysteries of quantum theory 
is the origin of this tensor product construction. The tensor product is used in 
constructing both multiparticle states and many of the operators acting on these 
states. So the first challenge is to find a representation of the tensor product 
in terms of the geometric product. This is surprisingly simple to do, though 
only once we have introduced the idea of a relativistic configuration space. The 
geometric algebra of such a space is called the multiparticle spacetime algebra 
and it provides the ideal algebraic structure for studying multiparticle states 
and operators. This has applications in a wealth of subjects, from NMR spec- 
troscopy to quantum information processing, some of which are discussed below. 
Most of these applications concern non-relativistic multiparticle quantum me- 
chanics. Later in this chapter we turn to a discussion of the insights that this 
new approach can bring to relativistic multiparticle quantum theory. There we 
find a simple, geometric encoding of the Pauli principle, which opens up a route 
through to the full quantum field theory. 


9.1 Many-body quantum theory 


In order to set the context for this chapter, we start with a review of the basics 
of multiparticle quantum theory. We concentrate in particular on two-particle 
systems, which illustrate many of the necessary properties. The key concept 
is that the quantum theory of n particles is not described by a set of n single-particle
wavefunctions. Instead, it is described by one wavefunction that encodes the
entire state of the system of n particles. Unsurprisingly, the equations governing 
the evolution of such a wavefunction can be extraordinarily complex. 

For a wide range of problems one can separate position degrees of freedom 
from internal (spin) degrees of freedom. This is typically the case in non- 
relativistic physics, particularly if the electromagnetic field can be treated as 
constant. In this case the position degrees of freedom are handled by the many- 
body Schrödinger equation. The spin degrees of freedom in many ways represent 
a cleaner system to study, as they describe the quantum theory of n two-state 
systems. This illustrates the two most important features of multiparticle quan- 
tum theory: the exponential increase in the size of state space, and the existence 
of entangled states. 


9.1.1 The two-body Schrödinger equation 


Two-particle states are described by a single wavefunction $\psi(r_1, r_2)$. The joint
vectors $(r_1, r_2)$ define an abstract six-dimensional configuration space over which
$\psi$ defines a complex-valued function. This sort of configuration space is a useful
tool in classical mechanics, and in quantum theory it is indispensable. The 
kinetic energy operator is given by the sum of the individual operators:

\[ \hat{K} = -\frac{\hbar^2 \nabla_1^2}{2m_1} - \frac{\hbar^2 \nabla_2^2}{2m_2}. \tag{9.2} \]

The subscripts refer to the individual particles, and $m_i$ is the mass of particle $i$.
The two-particle Schrödinger equation is now 


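\[ i\hbar \frac{\partial\psi}{\partial t} = -\frac{\hbar^2\nabla_1^2}{2m_1}\psi - \frac{\hbar^2\nabla_2^2}{2m_2}\psi + V(r_1, r_2)\psi. \tag{9.3} \]

As a simple example, consider the bound state Coulomb problem

\[ -\frac{\hbar^2\nabla_1^2}{2m_1}\psi - \frac{\hbar^2\nabla_2^2}{2m_2}\psi - \frac{q_1 q_2}{4\pi\epsilon_0 r}\psi = E\psi, \tag{9.4} \]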


where $r$ is the Euclidean distance between the points $r_1$ and $r_2$. This problem is
separated in a similar manner to the classical Kepler problem (see section 3.2).
We introduce the vectors

\[ r = r_1 - r_2, \qquad R = \frac{m_1 r_1 + m_2 r_2}{M}, \qquad \frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2}, \tag{9.5} \]

where $\mu$ is the reduced mass and $M = m_1 + m_2$ is the total mass. In terms of these
new variables the Schrödinger equation becomes


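\[ -\frac{\hbar^2\nabla_r^2}{2\mu}\psi - \frac{\hbar^2\nabla_R^2}{2M}\psi - \frac{q_1 q_2}{4\pi\epsilon_0 r}\psi = E\psi. \tag{9.6} \]

We can now find separable solutions to this equation by setting

\[ \psi(r_1, r_2) = \phi(r)\Psi(R). \tag{9.7} \]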


The wavefunction $\Psi$ satisfies a free-particle equation, which corresponds classically
to the motion of the centre of mass. The remaining term, $\phi(r)$, satisfies
the equivalent single-particle equation, with the mass given by the reduced mass 
of the two particles. 
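
The separation into relative and centre-of-mass motion can be checked numerically. The sketch below is an illustration only: the mass values are arbitrary, the variable names are hypothetical, and it assumes the standard centre-of-mass vector $R = (m_1 r_1 + m_2 r_2)/M$. It applies the chain rule to the two gradient operators and confirms that the kinetic quadratic form becomes diagonal, with coefficients $1/2\mu$ and $1/2M$ and no cross term.

```python
import numpy as np

# Hypothetical masses in arbitrary units (illustrative values only).
m1, m2 = 1.0, 1836.0
M = m1 + m2                      # total mass
mu = m1 * m2 / M                 # reduced mass, 1/mu = 1/m1 + 1/m2

# Chain rule for r = r1 - r2 and R = (m1 r1 + m2 r2)/M:
#   grad_1 =  grad_r + (m1/M) grad_R
#   grad_2 = -grad_r + (m2/M) grad_R
# Rows: (grad_1, grad_2); columns: (grad_r, grad_R).
jac = np.array([[1.0, m1 / M],
                [-1.0, m2 / M]])

# Kinetic quadratic form in the original variables (hbar = 1, overall sign dropped).
kinetic_12 = np.diag([1.0 / (2.0 * m1), 1.0 / (2.0 * m2)])

# The same quadratic form expressed in the (r, R) variables.
kinetic_rR = jac.T @ kinetic_12 @ jac

# Expected diagonal form: 1/(2 mu) for the relative motion, 1/(2 M) for the centre of mass.
expected = np.diag([1.0 / (2.0 * mu), 1.0 / (2.0 * M)])
print(np.allclose(kinetic_rR, expected))
```

The vanishing off-diagonal entry is what allows the product ansatz of equation (9.7) to separate the equation.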

This basic example illustrates how quantum mechanics accounts for multipar- 
ticle interactions. There is a single wavefunction, which simultaneously accounts 
for the properties of all of the particles. In many cases this wavefunction de- 
composes into the product of a number of simpler wavefunctions, but this is not 
always the case. One can construct states that cannot be decomposed into a 
single direct product state. An important example of this arises when the two 
particles in question are identical. In this case one can see immediately that if
$\psi(r_1, r_2)$ is an eigenstate of a two-particle Hamiltonian, then so too is $\psi(r_2, r_1)$.
The operator that switches particle labels like this is called the particle interchange
operator $\hat{P}$, and it commutes with all physically acceptable Hamiltonians.
Since it commutes with the Hamiltonian, and squares to the identity operation,
there are two possible eigenstates of $\hat{P}$. These are

\[ \psi_\pm = \psi(r_1, r_2) \pm \psi(r_2, r_1). \tag{9.8} \]

These two possibilities are the only ones that arise physically, and give rise to the
distinction between fermions (minus sign) and bosons (plus sign). Here we see 
the first indications of some new physical possibilities entering in multiparticle 
interactions. Quantum theory remains linear, so one can form complex super- 
positions of the n-particle wavefunctions. These superpositions can have new 
properties not present in the single-particle theory. 


9.1.2 Spin states 


Ignoring the spatial dependence and concentrating instead on the internal spin 
degrees of freedom, a spin-1/2 state can be written as a complex superposition of 
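‘up’ and ‘down’ states, which we will denote as $|0\rangle$ and $|1\rangle$. Now suppose that a
second particle is introduced, so that system 1 is in the state $|\psi\rangle$ and system 2 is
in the state $|\phi\rangle$. The joint state of the system is described by the tensor product
state

\[ |\Psi\rangle = |\psi\rangle \otimes |\phi\rangle, \tag{9.9} \]

which is abbreviated to $|\psi\rangle|\phi\rangle$. The total set of possible states is described by
the basis

\[ \begin{aligned} |00\rangle &= |0\rangle|0\rangle, & |01\rangle &= |0\rangle|1\rangle, \\ |10\rangle &= |1\rangle|0\rangle, & |11\rangle &= |1\rangle|1\rangle. \end{aligned} \tag{9.10} \]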


This illustrates an important phenomenon of multiparticle quantum theory. The 
number of available states grows as $2^n$, so large systems have an enormously
larger state space than their classical counterparts. Superpositions of these basis
states will, in general, produce states which cannot be written as a single tensor
product of the form $|\psi\rangle|\phi\rangle$. Such states are entangled. A standard example is
the singlet state of equation (9.1). One feature of these entangled states is that 
they provide ‘short-cuts’ through Hilbert space between classical states. The 
speed-up this can offer is often at the core of algorithms designed to exploit the 
possibilities offered by quantum computation. 

A challenge faced by theorists looking for ways to exploit these ideas is how 
best to classify multiparticle entanglement. The problem is to describe concisely 
the properties of a state that are unchanged under local unitary operations. Local 
operations consist of unitary transformations applied entirely to one particle. 
They correspond to operations applied to a single particle in the laboratory. 
Features of the state that are unchanged by these operations relate to joint 
properties of the particles, in particular how entangled they are. 

To date, only two-particle (or ‘bipartite’) systems have been fully understood. 
A general state of two particles can be written 


\[ \psi = \sum_{i,j} \alpha_{i,j} |i\rangle \otimes |j\rangle, \tag{9.11} \]


where the $|i\rangle$ denote some orthonormal basis. The Schmidt decomposition (which
is little more than a singular-value decomposition of $\alpha_{i,j}$) tells us that one can
always construct a basis such that

\[ \psi = \sum_i \beta_i |i'\rangle \otimes |i'\rangle. \tag{9.12} \]

The $\beta_i$ are all real parameters that tell us directly how much entanglement is
present. These parameters are unchanged under local transformations of the
state $\psi$. An important example of the Schmidt decomposition, which we shall
revisit frequently, is for systems of two entangled spinors. For these we find that 
a general state can be written explicitly as 


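\[ |\psi\rangle = \rho^{1/2} e^{i\chi} \left[ \cos(\alpha/2)\, e^{i\tau/2} \begin{pmatrix} \cos(\theta_1/2) e^{-i\phi_1/2} \\ \sin(\theta_1/2) e^{i\phi_1/2} \end{pmatrix} \otimes \begin{pmatrix} \cos(\theta_2/2) e^{-i\phi_2/2} \\ \sin(\theta_2/2) e^{i\phi_2/2} \end{pmatrix} + \sin(\alpha/2)\, e^{-i\tau/2} \begin{pmatrix} \sin(\theta_1/2) e^{-i\phi_1/2} \\ -\cos(\theta_1/2) e^{i\phi_1/2} \end{pmatrix} \otimes \begin{pmatrix} \sin(\theta_2/2) e^{-i\phi_2/2} \\ -\cos(\theta_2/2) e^{i\phi_2/2} \end{pmatrix} \right]. \tag{9.13} \]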


In this decomposition we arrange that $0 \le \alpha \le \pi/2$, so that $\cos(\alpha/2) \ge \sin(\alpha/2) \ge 0$
and the decomposition is unique (save for certain special cases).
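
The remark that the Schmidt decomposition is essentially a singular-value decomposition can be made concrete. The following is a minimal sketch using NumPy; the function names `schmidt_coefficients` and `random_unitary` are illustrative choices, not notation from the text. It reshapes a two-spin state into its $2 \times 2$ coefficient matrix $\alpha_{i,j}$, reads off the Schmidt coefficients as singular values, and checks that local unitary operations leave them unchanged.

```python
import numpy as np

ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)

def schmidt_coefficients(state):
    """Singular values of the 2x2 coefficient matrix alpha_{ij} of a two-spin state."""
    alpha = state.reshape(2, 2)          # state = sum_ij alpha_ij |i>|j>
    return np.linalg.svd(alpha, compute_uv=False)

def random_unitary(rng):
    """A random 2x2 unitary, obtained from the QR factorisation of a complex matrix."""
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, _ = np.linalg.qr(z)
    return q

product = np.kron(ket0, ket1)                                        # |0>|1>
singlet = (np.kron(ket0, ket1) - np.kron(ket1, ket0)) / np.sqrt(2)   # equation (9.1)

beta_product = schmidt_coefficients(product)   # a single non-zero coefficient
beta_singlet = schmidt_coefficients(singlet)   # two equal coefficients

# Local unitaries U1 (x) U2 leave the Schmidt coefficients unchanged.
rng = np.random.default_rng(0)
rotated = np.kron(random_unitary(rng), random_unitary(rng)) @ singlet
beta_rotated = schmidt_coefficients(rotated)

# Entanglement angle in the convention beta = (cos(alpha/2), sin(alpha/2)).
alpha_angle = 2.0 * np.arctan2(beta_singlet[1], beta_singlet[0])
```

For the singlet the two coefficients are equal, which corresponds to the largest entanglement angle allowed by the ordering $\cos(\alpha/2) \ge \sin(\alpha/2)$.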


9.1.3 Pure and mixed states


So far the discussion has focused entirely on pure states, which can be described 
in terms of a single wavefunction. For many applications, however, such a de- 
scription is inappropriate. Suppose, for example, that we are studying spin 
states in an NMR experiment. The spin states are only partially coherent, and 
one works in terms of ensemble averages. For example, the average spin vector 
(or polarisation) is given by 


\[ p = \frac{1}{n} \sum_{i=1}^{n} \hat{s}_i. \tag{9.14} \]


Unless all of the spin vectors are precisely aligned (a coherent state), the polar- 
isation vector will not have unit length and so cannot be generated by a single 
wavefunction. Instead, we turn to a formulation in terms of density matrices. 
The density matrix for a normalised pure state is 


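\[ \hat{\rho} = |\psi\rangle\langle\psi|, \tag{9.15} \]

which is necessarily a Hermitian matrix. All of the observables associated with
the state $|\psi\rangle$ can be obtained from the density matrix by writing

\[ \langle\psi|\hat{Q}|\psi\rangle = \mathrm{tr}(\hat{\rho}\hat{Q}). \tag{9.16} \]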

For an incoherent mixture (a mixed state) the density matrix is the weighted
sum of the matrices for the pure states:

\[ \hat{\rho} = \sum_{i=1}^{n} p_i |\psi_i\rangle\langle\psi_i|. \tag{9.17} \]

The real coefficients satisfy

\[ \sum_{i=1}^{n} p_i = 1, \tag{9.18} \]

which ensures that the density matrix has unit trace. The definition of $\hat{\rho}$ ensures
that all observables are constructed from the appropriate averages of the pure 
states. In principle, the state of any system is described by a Hermitian density 
matrix, which is constrained to be positive-semidefinite and to have unit trace. 
All observables are then formed according to equation (9.16). 

The need for a density matrix can be seen in a second way, as a consequence of 
entanglement. Suppose that we are interested in the state of particle 1, but that 
this particle has been allowed to entangle with a second particle 2, forming the 
pure state $|\psi\rangle$. The density matrix for the two-particle system is again described
by equation (9.15). But we can only perform measurements of particle 1. The
effective density matrix for particle 1 is obtained by performing a partial trace
of $\hat{\rho}$ to trace out the degrees of freedom associated with particle 2. We therefore
define

\[ \hat{\rho}_1 = \mathrm{tr}_2\, \hat{\rho}, \tag{9.19} \]

where the sum runs over the space of particle 2. One can easily check that, in
the case where the particles are entangled, $\hat{\rho}_1$ is no longer the density matrix
for a pure state. The most extreme example of this is the singlet state (9.1) 
mentioned in the introduction. In the obvious basis, the singlet state can be
written as

\[ |\epsilon\rangle = \frac{1}{\sqrt{2}} (0, 1, -1, 0)^\dagger. \tag{9.20} \]

The density matrix for this state is

\[ \hat{\rho} = |\epsilon\rangle\langle\epsilon| = \frac{1}{2} \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \tag{9.21} \]

This is appropriate for a pure state, as the matrix satisfies $\hat{\rho}^2 = \hat{\rho}$. But if we
now form the partial trace over the second particle we are left with

\[ \hat{\rho}_1 = \frac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{9.22} \]

This is the density matrix for a totally unpolarised state, which is to be expected,
since there can be no directional information in the singlet state. Clearly, $\hat{\rho}_1$
cannot be generated by a single-particle pure state.
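
The partial trace of the singlet density matrix is short enough to reproduce directly. This is a sketch in NumPy, with the helper name `partial_trace_over_2` chosen purely for illustration. It builds the matrix of equation (9.21), traces out the second particle, and compares the purity $\mathrm{tr}(\hat{\rho}^2)$ before and after.

```python
import numpy as np

# Singlet state in the basis |00>, |01>, |10>, |11>, as in equation (9.20).
singlet = np.array([0.0, 1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2)

# Pure-state density matrix of equation (9.21).
rho = np.outer(singlet, singlet.conj())

def partial_trace_over_2(rho):
    """Trace out the second two-state system of a 4x4 density matrix."""
    rho4 = rho.reshape(2, 2, 2, 2)               # indices (i, j, i', j')
    return np.einsum('ijkj->ik', rho4)           # sum over j = j'

rho1 = partial_trace_over_2(rho)

purity_joint = np.trace(rho @ rho).real          # equals 1 for a pure state
purity_reduced = np.trace(rho1 @ rho1).real      # falls below 1 for a mixed state
```

A purity below one for `rho1` is the numerical signature that the reduced state cannot come from a single-particle wavefunction.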


9.2 Multiparticle spacetime algebra 


The key to constructing a suitable geometric framework for multiparticle quan- 
tum theory involves the full, relativistic spacetime algebra. This is because it is 
only the relativistic treatment which exposes the nature of the $\sigma_i$ as spacetime
bivectors. This is crucial for determining their algebraic properties as further
particles are added. The n-particle spacetime algebra is the geometric algebra
of 4n-dimensional relativistic configuration space. We call this the multiparticle
spacetime algebra. A basis for this is constructed by taking n sets of basis
vectors $\{\gamma_\mu^a\}$, where the superscript labels the particle space. These satisfy the
orthogonality conditions

\[ \gamma_\mu^a \gamma_\nu^b + \gamma_\nu^b \gamma_\mu^a = \begin{cases} 0 & a \neq b \\ 2\eta_{\mu\nu} & a = b \end{cases} \tag{9.23} \]

which are summarised in the single formula

\[ \gamma_\mu^a \cdot \gamma_\nu^b = \delta^{ab} \eta_{\mu\nu}. \tag{9.24} \]


There is nothing uniquely quantum-mechanical in this construction. A system 
of three classical particles could be described by a set of three trajectories in a 
single space, or by one path in a nine-dimensional space. The extra dimensions 
label the properties of each individual particle, and are not to be thought of as 
existing in anything other than a mathematical sense. One unusual feature con- 
cerning relativistic configuration space is that it requires a separate copy of the 
time dimension for each particle, as well as the three spatial dimensions. This 
is required in order that the algebra is fully Lorentz-covariant. The presence 
of multiple time coordinates can complicate the evolution equations in the rel- 
ativistic theory. Fortunately, the non-relativistic reduction does not suffer from 
this problem as all of the individual time coordinates are identified with a single 
absolute time. 

As in the single-particle case, the even subalgebra of each copy of the spacetime 
algebra defines an algebra for relative space. We perform all spacetime splits with 
the vector $\gamma_0$, using a separate copy of this vector in each particle’s space. A
basis set of relative vectors is then defined by

\[ \sigma_i^a = \gamma_i^a \gamma_0^a. \tag{9.25} \]


Again, superscripts label the particle space in which the object appears, and 
subscripts are retained for the coordinate frame. We do not enforce the sum- 
mation convention for superscripted indices in this chapter. If we now consider 
bivectors from spaces 1 and 2, we find that the basis elements satisfy

\[ \sigma_i^1 \sigma_j^2 = \gamma_i^1 \gamma_0^1 \gamma_j^2 \gamma_0^2 = \gamma_i^1 \gamma_j^2 \gamma_0^2 \gamma_0^1 = \gamma_j^2 \gamma_0^2 \gamma_i^1 \gamma_0^1 = \sigma_j^2 \sigma_i^1. \tag{9.26} \]

The basis elements commute, rather than anticommute. This solves the problem
of how to represent the tensor product in geometric algebra. The geometric
product $\sigma_i^1 \sigma_j^2$ is the tensor product. Since single particle states are constructed
out of geometric algebra elements, this gives a natural origin for tensor product
states in the multiparticle case. This property only holds because the relative
vectors $\sigma_i^a$ are constructed as spacetime bivectors.
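
The step from anticommuting vectors to commuting relative vectors can be checked in a concrete matrix model. The sketch below is an assumption-laden illustration, not the construction used in the text: it represents the two-particle basis vectors by $16 \times 16$ matrices, taking $\gamma_\mu \otimes 1$ for particle 1 and $(i\gamma_0\gamma_1\gamma_2\gamma_3) \otimes \gamma_\mu$ for particle 2, so that vectors from different spaces anticommute. All variable names are hypothetical.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
id2, id4 = np.eye(2, dtype=complex), np.eye(4, dtype=complex)

# Dirac matrices for signature (+, -, -, -): gamma_0 = s3 (x) 1, gamma_k = i s2 (x) s_k.
gamma = [np.kron(s3, id2)] + [np.kron(1j * s2, sk) for sk in (s1, s2, s3)]

# i gamma_0 gamma_1 gamma_2 gamma_3 squares to +1 and anticommutes with each gamma_mu.
chirality = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

# Assumed 16x16 model: the chirality factor makes vectors from the two spaces anticommute.
g1 = [np.kron(g, id4) for g in gamma]          # gamma_mu^1
g2 = [np.kron(chirality, g) for g in gamma]    # gamma_mu^2

eta = np.diag([1.0, -1.0, -1.0, -1.0])
id16 = np.eye(16)

def sym(a, b):
    """Symmetrised product, the matrix counterpart of the inner product of two vectors."""
    return 0.5 * (a @ b + b @ a)

# Equation (9.24): gamma_mu^a . gamma_nu^b = delta^{ab} eta_{mu nu}.
same_space_ok = all(
    np.allclose(sym(g[m], g[n]), eta[m, n] * id16)
    for g in (g1, g2) for m in range(4) for n in range(4)
)
cross_space_ok = all(np.allclose(sym(a, b), 0.0) for a in g1 for b in g2)

# Relative vectors of equation (9.25): sigma_i^a = gamma_i^a gamma_0^a.
sig1 = [g1[i] @ g1[0] for i in (1, 2, 3)]
sig2 = [g2[i] @ g2[0] for i in (1, 2, 3)]

# Equation (9.26): relative vectors from different spaces commute.
commute_ok = all(np.allclose(a @ b, b @ a) for a in sig1 for b in sig2)

# In this model the geometric product sigma_1^1 sigma_2^2 is literally a Kronecker product.
kron_ok = np.allclose(
    sig1[0] @ sig2[1],
    np.kron(gamma[1] @ gamma[0], gamma[2] @ gamma[0]),
)
```

The chirality factor cancels in every product of two vectors from space 2, which is the matrix version of the statement that the commuting property relies on the relative vectors being bivectors.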

The pseudoscalar for each particle space is defined in the obvious way, so that

\[ I^a = \gamma_0^a \gamma_1^a \gamma_2^a \gamma_3^a. \tag{9.27} \]

Relative bivectors in each space take the form $I^a \sigma_k^a$. Wherever possible we
abbreviate these by dropping the first particle label, so that

\[ I\sigma_k^a = I^a \sigma_k^a. \tag{9.28} \]

The reverse operation in the multiparticle spacetime algebra is denoted with a
tilde, and reverses the order of products of all relativistic vectors. Wherever
possible we use this operation when forming observables. The Hermitian adjoint
in each space can be constructed by inserting appropriate factors of $\gamma_0^a$.


9.2.1 Non-relativistic states and the correlator 


In the single-particle theory, non-relativistic states are constructed from the even 
subalgebra of the Pauli algebra. A basis for these is provided by the set $\{1, I\sigma_k\}$.
When forming multiparticle states we take tensor products of the individual 
particle states. Since the tensor product and geometric product are equivalent 
in the multiparticle spacetime algebra, a complete basis is provided by the set 


\[ \{1,\; I\sigma_k^1,\; I\sigma_k^2,\; I\sigma_j^1 I\sigma_k^2\}. \tag{9.29} \]


But these basis elements span a 16-dimensional real space, whereas the state 
space for two spin-1/2 particles is a four-dimensional complex space — only 
eight real degrees of freedom. What has gone wrong? The answer lies in our 
treatment of the complex structure. Quantum theory works with a single unit 
imaginary i, but in our two-particle algebra we now have two bivectors playing 
the role of $i$: $I\sigma_3^1$ and $I\sigma_3^2$. Right-multiplication of a state by either of these
has to result in the same state in order for the geometric algebra treatment to
faithfully mirror standard quantum mechanics. That is, we must have

\[ \psi I\sigma_3^1 = \psi I\sigma_3^2. \tag{9.30} \]

Rearranging this, we find that

\[ \psi = -\psi I\sigma_3^1 I\sigma_3^2 = \psi\, \tfrac{1}{2} (1 - I\sigma_3^1 I\sigma_3^2). \tag{9.31} \]

This tells us what we must do. If we define

\[ E = \tfrac{1}{2} (1 - I\sigma_3^1 I\sigma_3^2), \tag{9.32} \]

we find that

\[ E^2 = E. \tag{9.33} \]

So right-multiplication by E is a projection operation. If we include this factor
on the right of all states we halve the number of (real) degrees of freedom from
16 to the expected 8.

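The spacetime algebra representation of a direct-product two-particle Pauli
spinor is now given by $\psi^1 \phi^2 E$, where $\psi^1$ and $\phi^2$ are spinors (even multivectors)
in their own spaces. A complete basis for two-particle spin states is provided by

\[ \begin{aligned} |0\rangle|0\rangle &\leftrightarrow E, \\ |0\rangle|1\rangle &\leftrightarrow -I\sigma_2^2 E, \\ |1\rangle|0\rangle &\leftrightarrow -I\sigma_2^1 E, \\ |1\rangle|1\rangle &\leftrightarrow I\sigma_2^1 I\sigma_2^2 E. \end{aligned} \tag{9.34} \]

We further define

\[ J = E I\sigma_3^1 = E I\sigma_3^2 = \tfrac{1}{2} (I\sigma_3^1 + I\sigma_3^2), \tag{9.35} \]

so that

\[ J^2 = -E. \tag{9.36} \]

Right-sided multiplication by J takes on the role of multiplication by the quantum
imaginary $i$ for multiparticle states.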

This procedure extends simply to higher multiplicities. All that is required is
to find the ‘quantum correlator’ $E_n$ satisfying

\[ E_n I\sigma_3^a = E_n I\sigma_3^b = J_n \quad \text{for all } a, b. \tag{9.37} \]

$E_n$ can be constructed by picking out the $a = 1$ space, say, and correlating all
the other spaces to this, so that

\[ E_n = \prod_{b=2}^{n} \tfrac{1}{2} (1 - I\sigma_3^1 I\sigma_3^b). \tag{9.38} \]

The value of $E_n$ is independent of which of the n spaces is singled out and
correlated to. The complex structure is defined by

\[ J_n = E_n I\sigma_3^a, \tag{9.39} \]

where $I\sigma_3^a$ can be chosen from any of the n spaces. To illustrate this consider
the case of n = 3, where

\[ \begin{aligned} E_3 &= \tfrac{1}{4} (1 - I\sigma_3^1 I\sigma_3^2)(1 - I\sigma_3^1 I\sigma_3^3) \\ &= \tfrac{1}{4} (1 - I\sigma_3^1 I\sigma_3^2 - I\sigma_3^1 I\sigma_3^3 - I\sigma_3^2 I\sigma_3^3) \end{aligned} \tag{9.40} \]

and

\[ J_3 = \tfrac{1}{4} (I\sigma_3^1 + I\sigma_3^2 + I\sigma_3^3 - I\sigma_3^1 I\sigma_3^2 I\sigma_3^3). \tag{9.41} \]

Both $E_3$ and $J_3$ are symmetric under permutations of their indices.
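
The algebraic identities for the correlator can be verified in a small matrix model. The sketch below assumes one convenient representation, which is not taken from the text: $I\sigma_k^a$ is modelled by $i\sigma_k$ placed in slot $a$ of a Kronecker product, so bivectors from different particle spaces commute automatically. The helper names `isig`, `correlator` and `real_rank` are hypothetical.

```python
import numpy as np
from functools import reduce

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
id2 = np.eye(2, dtype=complex)

def isig(k, a, n):
    """Assumed matrix model of I sigma_k^a (k = 1..3) in an n-particle system."""
    factors = [id2] * n
    factors[a - 1] = 1j * s[k - 1]
    return reduce(np.kron, factors)

def correlator(n):
    """E_n of equation (9.38): product over b = 2..n of (1 - I sigma_3^1 I sigma_3^b)/2."""
    one = np.eye(2 ** n, dtype=complex)
    e = one
    for b in range(2, n + 1):
        e = e @ (0.5 * (one - isig(3, 1, n) @ isig(3, b, n)))
    return e

def real_rank(mats):
    """Dimension of the real span of a list of complex matrices."""
    rows = [np.concatenate([m.real.ravel(), m.imag.ravel()]) for m in mats]
    return np.linalg.matrix_rank(np.array(rows))

# Two particles: equations (9.32) to (9.36).
E2 = correlator(2)
J2 = E2 @ isig(3, 1, 2)
idempotent = np.allclose(E2 @ E2, E2)
single_i = np.allclose(E2 @ isig(3, 1, 2), E2 @ isig(3, 2, 2))
j_squared = np.allclose(J2 @ J2, -E2)

# The 16 basis elements of equation (9.29), before and after right-multiplication by E.
basis = ([np.eye(4, dtype=complex)]
         + [isig(k, a, 2) for a in (1, 2) for k in (1, 2, 3)]
         + [isig(j, 1, 2) @ isig(k, 2, 2) for j in (1, 2, 3) for k in (1, 2, 3)])
dim_full = real_rank(basis)
dim_projected = real_rank([b @ E2 for b in basis])

# Three particles: expanded forms (9.40) and (9.41).
a1, a2, a3 = (isig(3, a, 3) for a in (1, 2, 3))
E3 = correlator(3)
E3_expanded = 0.25 * (np.eye(8) - a1 @ a2 - a1 @ a3 - a2 @ a3)
J3_expanded = 0.25 * (a1 + a2 + a3 - a1 @ a2 @ a3)
```

Comparing `dim_full` with `dim_projected` reproduces the halving of the real degrees of freedom described above, and multiplying `E3` by any of `a1`, `a2`, `a3` gives the same `J3_expanded`.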


9.2.2 Operators and observables 


All of the operators defined for the single-particle spacetime algebra extend nat- 
urally to the multiparticle algebra. In the two-particle case, for example, we 
have 


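\[ i\hat{\sigma}_k \otimes \hat{1}\, |\psi\rangle \leftrightarrow I\sigma_k^1 \psi, \tag{9.42} \]

\[ \hat{1} \otimes i\hat{\sigma}_k\, |\psi\rangle \leftrightarrow I\sigma_k^2 \psi, \tag{9.43} \]

where $\hat{1}$ is the 2 × 2 identity matrix and a factor of E is implicit in the spinor $\psi$.
For the Hermitian operators we form, for example,

\[ \hat{\sigma}_k \otimes \hat{1}\, |\psi\rangle \leftrightarrow -I\sigma_k^1 \psi J = \sigma_k^1 \psi \sigma_3^1. \tag{9.44} \]

This generalises in the obvious way, so that

\[ \hat{1} \otimes \cdots \otimes \hat{\sigma}_k^a \otimes \cdots \otimes \hat{1}\, |\psi\rangle \leftrightarrow \sigma_k^a \psi \sigma_3^a. \tag{9.45} \]

We continue to adopt the j symbol as a convenient shorthand notation for the
complex structure, so

\[ i|\psi\rangle \leftrightarrow j\psi = \psi J = \psi I\sigma_3^a. \tag{9.46} \]

The quantum inner product is now

\[ \langle\psi|\phi\rangle \leftrightarrow 2^{n-1} \bigl( \langle \phi E \tilde{\psi} \rangle - \langle \phi J \tilde{\psi} \rangle j \bigr). \tag{9.47} \]

The factor of E in the real part is not strictly necessary as it is always present in
the spinors, but including it does provide a neat symmetry between the real and
imaginary parts. The factor of $2^{n-1}$ guarantees complete consistency with the
standard quantum inner product, as it ensures that the state E has unit norm.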

Suppose that we now form the observables in the two-particle case. We find 
that 


\[ \langle\psi| \hat{\sigma}_k \otimes \hat{1} |\psi\rangle \leftrightarrow -2 I\sigma_k^1 \cdot (\psi J \tilde{\psi}) \tag{9.48} \]

and

\[ \langle\psi| \hat{\sigma}_j \otimes \hat{\sigma}_k |\psi\rangle \leftrightarrow -2 (I\sigma_j^1 I\sigma_k^2) \cdot (\psi E \tilde{\psi}). \tag{9.49} \]

All of the observables one can construct are therefore contained in the multivectors
$\psi E \tilde{\psi}$ and $\psi J \tilde{\psi}$. This generalises to arbitrary particle numbers. To see why,
we use the fact that any density matrix can be expanded in terms of products
of Hermitian operators, as in the two-particle expansion

\[ \hat{\rho} = |\psi\rangle\langle\psi| = \tfrac{1}{4} \bigl( \hat{1} \otimes \hat{1} + a_k\, \hat{\sigma}_k \otimes \hat{1} + b_k\, \hat{1} \otimes \hat{\sigma}_k + c_{jk}\, \hat{\sigma}_j \otimes \hat{\sigma}_k \bigr). \tag{9.50} \]

The various coefficients are found by taking inner products with the appropriate
combinations of operators. Each of these corresponds to picking out a term in
$\psi E \tilde{\psi}$ or $\psi J \tilde{\psi}$. If an even number of Pauli matrices is involved we pick out a
term in $\psi E \tilde{\psi}$, and an odd number picks out a term in $\psi J \tilde{\psi}$. In general, $\psi E \tilde{\psi}$
contains terms of grades 0, 4, ..., and $\psi J \tilde{\psi}$ contains terms of grade 2, 6, ....
These account for all the coefficients in the density matrix, and hence for all the
observables that can be formed from $\psi$.

An advantage of working directly with the observables $\psi E \tilde{\psi}$ and $\psi J \tilde{\psi}$ is that
the partial trace operation has a simple interpretation. If we want to form the
partial trace over the ath particle, we simply remove all terms from the observables
with a contribution in the ath particle space. No actual trace operation is
required. Furthermore, this operation of discarding information is precisely the
correct physical picture for the partial trace operation: we are discarding the
(often unknown) information associated with a particle in one or more spaces.
A minor complication in this approach is that $\psi J \tilde{\psi}$ gives rise to anti-Hermitian
terms, whereas the density matrix is Hermitian. One way round this is to correlate
all of the pseudoscalars together and then dualise all bivectors back to
vectors. This is the approach favoured by Havel and coworkers in their work on
NMR spectroscopy. Alternatively, one can simply ignore this feature and work
directly with the observables $\psi E \tilde{\psi}$ and $\psi J \tilde{\psi}$. When presented with a general
density matrix one often needs to pull it apart into sums of terms like this anyway
(the product operator expansion), so it makes sense to work directly with
the multivector observables when they are available.
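
The correspondence between the multivector observables and standard expectation values can be tested numerically. The sketch below rests on an assumed matrix model that is not part of the text: $I\sigma_k^a$ is represented by $i\sigma_k$ in slot $a$ of a Kronecker product, the reverse becomes the Hermitian conjugate, and the scalar part becomes one quarter of the trace. The helper names are hypothetical. A random two-spin state is mapped to a multivector using the basis of equation (9.34), and the results are compared with ordinary matrix quantum mechanics.

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
id2 = np.eye(2, dtype=complex)

def isig(k, a):
    """Assumed matrix model of I sigma_k^a for two particles (k = 1..3, a = 1 or 2)."""
    m = 1j * s[k - 1]
    return np.kron(m, id2) if a == 1 else np.kron(id2, m)

E = 0.5 * (np.eye(4) - isig(3, 1) @ isig(3, 2))     # correlator, equation (9.32)
J = E @ isig(3, 1)                                   # complex structure, equation (9.35)

def scalar_part(m):
    """Grade-0 part: every non-scalar basis element is traceless in this model."""
    return np.trace(m).real / 4.0

def to_multivector(c):
    """Map amplitudes (c00, c01, c10, c11) to psi using the basis of equation (9.34)."""
    basis = [E, -isig(2, 2) @ E, -isig(2, 1) @ E, isig(2, 1) @ isig(2, 2) @ E]
    return sum(z.real * b + z.imag * (b @ J) for b, z in zip(basis, c))

rng = np.random.default_rng(1)
c = rng.normal(size=4) + 1j * rng.normal(size=4)
c /= np.linalg.norm(c)                               # normalised random state

psi = to_multivector(c)
rev = psi.conj().T            # the reverse acts as the Hermitian conjugate in this model
psi_J = psi @ J @ rev
psi_E = psi @ E @ rev

# Equation (9.48): single-particle expectation values from psi J psi~.
ga_single = np.array([-2 * scalar_part(isig(k, 1) @ psi_J) for k in (1, 2, 3)])
qm_single = np.array([(c.conj() @ np.kron(sk, id2) @ c).real for sk in s])

# Equation (9.49): two-particle correlations from psi E psi~.
ga_pair = np.array([[-2 * scalar_part(isig(j, 1) @ isig(k, 2) @ psi_E)
                     for k in (1, 2, 3)] for j in (1, 2, 3)])
qm_pair = np.array([[(c.conj() @ np.kron(sj, sk) @ c).real for sk in s] for sj in s])

# Partial trace over particle 2: keep only the space-1 terms, i.e. the vector ga_single.
rho1_ga = 0.5 * (id2 + sum(p * sk for p, sk in zip(ga_single, s)))
rho1_qm = np.einsum('ijkj->ik', np.outer(c, c.conj()).reshape(2, 2, 2, 2))
```

The last two lines illustrate the remark about partial traces: the reduced density matrix for particle 1 is rebuilt from the space-1 components alone, without evaluating a trace over particle 2.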


9.3 Systems of two particles 


Many of the preceding ideas are most simply illustrated for the case of a system of 
two particles. For these, the Schmidt decomposition of equation (9.13) provides 
a useful formulation for a general state. The geometric algebra version of this is 
rather more compact, however, as we now establish. First, we define the spinor 


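\[ \psi(\theta, \phi) = e^{-\phi I\sigma_3/2} e^{-\theta I\sigma_2/2}. \tag{9.51} \]

We also need a representation of the state orthogonal to this, which is

\[ \begin{pmatrix} \sin(\theta/2) e^{-i\phi/2} \\ -\cos(\theta/2) e^{i\phi/2} \end{pmatrix} \leftrightarrow \psi(\theta, \phi) I\sigma_2. \tag{9.52} \]

Now we are in a position to construct the multiparticle spacetime algebra version
of the Schmidt decomposition. We replace equation (9.13) with

\[ \begin{aligned} \psi &= \rho^{1/2} \bigl( \cos(\alpha/2) \psi^1(\theta_1, \phi_1) \psi^2(\theta_2, \phi_2) e^{J\tau/2} \\ &\qquad + \sin(\alpha/2) \psi^1(\theta_1, \phi_1) \psi^2(\theta_2, \phi_2) I\sigma_2^1 I\sigma_2^2 e^{-J\tau/2} \bigr) e^{J\chi} E \\ &= \rho^{1/2} \psi^1(\theta_1, \phi_1) \psi^2(\theta_2, \phi_2) e^{J\tau/2} \bigl( \cos(\alpha/2) + \sin(\alpha/2) I\sigma_2^1 I\sigma_2^2 \bigr) e^{J\chi} E. \end{aligned} \tag{9.53} \]

We now define the individual rotors

\[ R = \psi(\theta_1, \phi_1) e^{I\sigma_3 \tau/4}, \qquad S = \psi(\theta_2, \phi_2) e^{I\sigma_3 \tau/4}, \tag{9.54} \]

so that the wavefunction $\psi$ simplifies to

\[ \psi = \rho^{1/2} R^1 S^2 \bigl( \cos(\alpha/2) + \sin(\alpha/2) I\sigma_2^1 I\sigma_2^2 \bigr) e^{J\chi} E. \tag{9.55} \]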


This gives a compact, general form for an arbitrary two-particle state. The degrees
of freedom are held in an overall magnitude and phase, two separate rotors
in the individual particle spaces, and a single entanglement angle $\alpha$. In total this
gives nine degrees of freedom, so one must be redundant. This redundancy lies
in the single-particle rotors. If we take

\[ R \mapsto R e^{I\sigma_3 \beta}, \qquad S \mapsto S e^{-I\sigma_3 \beta}, \tag{9.56} \]

then the overall wavefunction $\psi$ is unchanged. In practice this redundancy is not
a problem, and the form of equation (9.55) turns out to be extremely useful. 


9.3.1 Observables for two-particle states 


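The individual rotors $R^1$ and $S^2$ generate rotations in their own spaces. These
are equivalent to local unitary transformations. The novel features associated
with the observables for a two-particle system arise from the entanglement angle
$\alpha$. To study this we first form the bivector observable $\psi J \tilde{\psi}$:

\[ \begin{aligned} \psi J \tilde{\psi} &= R^1 S^2 \bigl( \cos(\alpha/2) + \sin(\alpha/2) I\sigma_2^1 I\sigma_2^2 \bigr) J \bigl( \cos(\alpha/2) + \sin(\alpha/2) I\sigma_2^1 I\sigma_2^2 \bigr) \tilde{R}^1 \tilde{S}^2 \\ &= \tfrac{1}{2} R^1 S^2 \bigl( \cos^2(\alpha/2) - \sin^2(\alpha/2) \bigr) (I\sigma_3^1 + I\sigma_3^2) \tilde{R}^1 \tilde{S}^2 \\ &= \tfrac{1}{2} \cos(\alpha) \bigl( (R I\sigma_3 \tilde{R})^1 + (S I\sigma_3 \tilde{S})^2 \bigr), \end{aligned} \tag{9.57} \]

where we have assumed that $\rho = 1$. This result extends the definition of the spin
bivector to multiparticle systems. One can immediately see that the lengths of
the bivectors are no longer fixed, but instead depend on the entanglement. Only
in the case of zero entanglement are the spin bivectors unit length.
The remaining observables are contained in

\[ \psi E \tilde{\psi} = \tfrac{1}{2} R^1 S^2 \bigl( 1 - I\sigma_3^1 I\sigma_3^2 + \sin(\alpha) (I\sigma_2^1 I\sigma_2^2 - I\sigma_1^1 I\sigma_1^2) \bigr) \tilde{R}^1 \tilde{S}^2. \tag{9.58} \]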

To make this result clearer we introduce the notation

\[ A_k = R I\sigma_k \tilde{R}, \qquad B_k = S I\sigma_k \tilde{S}, \tag{9.59} \]

so that

\[ 2 \psi E \tilde{\psi} = 1 - A_3^1 B_3^2 + \sin(\alpha) (A_2^1 B_2^2 - A_1^1 B_1^2). \tag{9.60} \]

The scalar part confirms that the state is normalised correctly. The 4-vector
part contains an interesting new term, which goes as $A_2^1 B_2^2 - A_1^1 B_1^2$. None of the
individual $A_1$, $A_2$, $B_1$ or $B_2$ bivectors is accessible to measurement in the single-particle
case as they are not phase-invariant. But in the two-particle case these
terms do start to influence the observables. This is one of the essential differences
between classical and quantum models of spin.


9.3.2 Density matrices and probabilities 


Now that we have all of the observables, we have also found all of the terms in 
the density matrix. Of particular interest are the results of partial traces, where 
we discard the information associated with one of the particles. If we throw out 
all of the information about the second particle, for example, what remains is 
the single-particle density matrix 


\[ \rho = \tfrac{1}{2} (1 + p), \tag{9.61} \]

where the polarisation vector is given by

\[ p = \cos(\alpha) R \sigma_3 \tilde{R}. \tag{9.62} \]


This vector no longer has unit length, so the density matrix is that of a mixed 
state. Entanglement with a second particle has led to a loss of coherence of 
the first particle. This process, by which entanglement produces decoherence, is 
central to attempts to explain the emergence of classical physics from quantum 
theory. 
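
The shrinking of the polarisation vector with the entanglement angle can be seen in a few lines of standard matrix quantum mechanics. This sketch takes the simplest case $R = S = 1$, for which the state of equation (9.55) corresponds to $\cos(\alpha/2)|00\rangle + \sin(\alpha/2)|11\rangle$ under the basis of equation (9.34). The function name `reduced_polarisation` is an illustrative choice.

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def reduced_polarisation(alpha):
    """Polarisation vector of particle 1 for cos(alpha/2)|00> + sin(alpha/2)|11>."""
    state = np.zeros(4, dtype=complex)
    state[0] = np.cos(alpha / 2)
    state[3] = np.sin(alpha / 2)
    rho = np.outer(state, state.conj()).reshape(2, 2, 2, 2)
    rho1 = np.einsum('ijkj->ik', rho)            # partial trace over particle 2
    return np.array([np.trace(rho1 @ sk).real for sk in s])

alphas = np.linspace(0.0, np.pi / 2, 5)
lengths = np.array([np.linalg.norm(reduced_polarisation(a)) for a in alphas])

# Equation (9.62) predicts a polarisation vector of length cos(alpha).
predicted = np.cos(alphas)
```

At $\alpha = 0$ the length is one (a pure single-particle state), and at $\alpha = \pi/2$ it vanishes, which is the totally unpolarised case found for the singlet.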

For two particles we see that the loss of coherence is symmetric between the
particles. If we perform a partial trace over particle 1, the polarisation vector
for the second particle also has its length reduced by a factor of $\cos(\alpha)$. More
generally the picture is less simple, and much work remains in understanding
entanglement beyond the bipartite case.

A further application of the preceding is to calculate the overlap probability 
for the inner product of two states. Given two normalised states we have 


\[ P(\psi, \phi) = |\langle\psi|\phi\rangle|^2 = \mathrm{tr}(\hat{\rho}_\psi \hat{\rho}_\phi). \tag{9.63} \]

The degrees of freedom in the density matrices are contained in $\psi E \tilde{\psi}$ and $\psi J \tilde{\psi}$,
with equivalent expressions for $\phi$. When forming the inner product between two
density matrices, the only terms that can arise are inner products between these
observables. A little work confirms that we can write, in the n-particle case,

\[ P(\psi, \phi) = 2^{n-2} \langle (\psi E \tilde{\psi})(\phi E \tilde{\phi}) \rangle - 2^{n-2} \langle (\psi J \tilde{\psi})(\phi J \tilde{\phi}) \rangle. \tag{9.64} \]


Expressions like this are unique to the geometric algebra approach. The ex- 
pression confirms that once one has found the two multivector observables for a 
state, one has all of the available information to hand. 

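As an example, suppose that we are presented with two separable states, $\psi$
and $\phi$. For separable states we know that the observables take the forms

\[ 2 \psi J \tilde{\psi} = A^1 + B^2, \qquad 2 \psi E \tilde{\psi} = 1 - A^1 B^2 \tag{9.65} \]

and

\[ 2 \phi J \tilde{\phi} = C^1 + D^2, \qquad 2 \phi E \tilde{\phi} = 1 - C^1 D^2, \tag{9.66} \]

where each of the $A^1$, $B^2$, $C^1$ and $D^2$ are unit bivectors. We can now write

\[ \begin{aligned} P(\psi, \phi) &= \tfrac{1}{4} \langle (1 - A^1 B^2)(1 - C^1 D^2) - (A^1 + B^2)(C^1 + D^2) \rangle \\ &= \tfrac{1}{4} (1 + A{\cdot}C\, B{\cdot}D - A{\cdot}C - B{\cdot}D) \\ &= \tfrac{1}{2} (1 - A{\cdot}C)\, \tfrac{1}{2} (1 - B{\cdot}D). \end{aligned} \tag{9.67} \]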

This confirms the probability is the product of the separate single-particle prob- 
abilities. If one of the states is entangled this result no longer holds, as we see 
in the following section. 
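
The factorisation for separable states can be checked with ordinary spinors. In the sketch below the unit bivectors are replaced by their dual unit spin vectors $a$, $b$, $c$, $d$, for which $A \cdot C = -a \cdot c$, so the prediction reads $\tfrac{1}{4}(1 + a \cdot c)(1 + b \cdot d)$. The helper names and the random directions are illustrative assumptions.

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]

def spin_up_along(n):
    """Normalised +1 eigenvector of n . sigma for a unit vector n."""
    n_sigma = sum(nk * sk for nk, sk in zip(n, s))
    vals, vecs = np.linalg.eigh(n_sigma)
    return vecs[:, np.argmax(vals)]

def random_unit_vector(rng):
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
a, b, c, d = (random_unit_vector(rng) for _ in range(4))

# Two separable states with spin directions (a, b) and (c, d).
psi = np.kron(spin_up_along(a), spin_up_along(b))
phi = np.kron(spin_up_along(c), spin_up_along(d))

overlap = abs(np.vdot(psi, phi)) ** 2

# Product of the two single-particle probabilities, as in equation (9.67).
predicted = 0.25 * (1 + a @ c) * (1 + b @ d)
```

The arbitrary phase returned by the eigenvector routine drops out of the squared overlap, so the comparison does not depend on it.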


9.3.3 The singlet state 


As a further example of entanglement we now study some of the properties of 
the non-relativistic spin singlet state. This is 


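\[ |\epsilon\rangle = \frac{1}{\sqrt{2}} \bigl( |0\rangle|1\rangle - |1\rangle|0\rangle \bigr). \tag{9.68} \]

This is represented in the two-particle spacetime algebra by the multivector

\[ \epsilon = \frac{1}{\sqrt{2}} (I\sigma_2^1 - I\sigma_2^2) E. \tag{9.69} \]

The properties of $\epsilon$ are more easily seen by writing

\[ \epsilon = \tfrac{1}{2} (1 + I\sigma_2^1 I\sigma_2^2)\, \tfrac{1}{2} (1 + I\sigma_3^1 I\sigma_3^2)\, \sqrt{2}\, I\sigma_2^1, \tag{9.70} \]

which shows how $\epsilon$ contains the commuting idempotents $(1 + I\sigma_2^1 I\sigma_2^2)/2$ and
$(1 + I\sigma_3^1 I\sigma_3^2)/2$. Identifying these idempotents tells us immediately that

\[ I\sigma_2^1 \epsilon = \tfrac{1}{2} (I\sigma_2^1 - I\sigma_2^2)\, \tfrac{1}{2} (1 + I\sigma_3^1 I\sigma_3^2)\, \sqrt{2}\, I\sigma_2^1 = -I\sigma_2^2 \epsilon \tag{9.71} \]

and

\[ I\sigma_3^1 \epsilon = -I\sigma_3^2 \epsilon. \tag{9.72} \]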

It follows that

\[ I\sigma_1^1 \epsilon = I\sigma_3^1 I\sigma_2^1 \epsilon = -I\sigma_3^1 I\sigma_2^2 \epsilon = I\sigma_2^2 I\sigma_3^2 \epsilon = -I\sigma_1^2 \epsilon. \tag{9.73} \]

Combining these results, if $M^1$ is an arbitrary even element in the Pauli algebra
($M^1 = M^0 + M^k I\sigma_k^1$), $\epsilon$ satisfies

\[ M^1 \epsilon = \tilde{M}^2 \epsilon. \tag{9.74} \]

Here $M^1$ and $M^2$ denote the same multivector, but expressed in space 1 or
space 2.

Equation (9.74) provides a novel demonstration of the rotational invariance of
$\epsilon$. Under a joint rotation in two-particle space, a spinor $\psi$ transforms to $R^1 R^2 \psi$,
where $R^1$ and $R^2$ are copies of the same rotor but acting in the two different
spaces. From equation (9.74) it follows that, under such a rotation, $\epsilon$ transforms
as

\[ \epsilon \mapsto R^1 R^2 \epsilon = R^1 \tilde{R}^1 \epsilon = \epsilon, \tag{9.75} \]

so that $\epsilon$ is a genuine two-particle rotational scalar.
If we now form the observables from $\epsilon$ we find that

\[ 2 \epsilon E \tilde{\epsilon} = 1 + \sum_{k=1}^{3} I\sigma_k^1 I\sigma_k^2 \tag{9.76} \]

and

\[ \epsilon J \tilde{\epsilon} = 0. \tag{9.77} \]

The latter has to hold, as there are no rotationally-invariant bivector observables.
Equation (9.76) identifies a new two-particle invariant, which we can write as

\[ \sum_{k=1}^{3} I\sigma_k^1 I\sigma_k^2 = 2 \epsilon \tilde{\epsilon} - 1. \tag{9.78} \]

This is invariant under joint rotations in the two particle spaces. This multivector
equation contains the essence of the matrix result

\[ \sum_{k=1}^{3} \hat{\sigma}^k_{a a'} \hat{\sigma}^k_{b b'} = 2 \delta_{a b'} \delta_{a' b} - \delta_{a a'} \delta_{b b'}, \tag{9.79} \]

where $a$, $b$, $a'$, $b'$ label the matrix components. In standard quantum mechanics
this invariant would be thought of as arising from the ‘inner product’ of the spin
vectors $\hat{\sigma}^1$ and $\hat{\sigma}^2$. Here, we have seen that the invariant arises in a completely
different way, as a component of the multivector $\epsilon \tilde{\epsilon}$.
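
The matrix identity (9.79) is easy to confirm component by component. The sketch below uses `numpy.einsum` with the index letters `p` and `q` standing in for $a'$ and $b'$; these names, and the closing check on the singlet, are illustrative additions rather than steps taken in the text.

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)   # shape (3, 2, 2)
delta = np.eye(2)

# Left-hand side of (9.79): sum_k sigma^k_{a a'} sigma^k_{b b'}, indexed as (a, a', b, b').
lhs = np.einsum('kap,kbq->apbq', sigma, sigma)

# Right-hand side: 2 delta_{a b'} delta_{a' b} - delta_{a a'} delta_{b b'}.
rhs = 2 * np.einsum('aq,pb->apbq', delta, delta) - np.einsum('ap,bq->apbq', delta, delta)

identity_holds = np.allclose(lhs, rhs)

# As an operator on two spins the invariant is sum_k sigma_k (x) sigma_k.
invariant = sum(np.kron(sk, sk) for sk in sigma)
singlet = np.array([0.0, 1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2)
singlet_image = invariant @ singlet
```

The first delta term on the right-hand side is the particle-swap operator, so the antisymmetric singlet is an eigenvector of the invariant with eigenvalue $-3$.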

The fact that $\epsilon J \tilde{\epsilon} = 0$ confirms that the reduced density matrix for either
particle space is simply one-half of the identity matrix, as established in equation
(9.22). It follows that all directions are equally likely. If we align our
measuring apparatus along some given axis and measure the state of particle 1,
then both up and down have equal probabilities of 1/2. Suppose now that we
construct a joint measurement on the singlet state. We can model this as the
overlap probability between $\epsilon$ and the separable state

\[ \phi = R^1 S^2 E. \tag{9.80} \]

Denoting the spin directions by

\[ R I\sigma_3 \tilde{R} = P, \qquad S I\sigma_3 \tilde{S} = Q, \tag{9.81} \]

we find that, from equation (9.64),

\[ \begin{aligned} P(\epsilon, \phi) &= \bigl\langle \tfrac{1}{2} (1 - P^1 Q^2)\, \tfrac{1}{2} (1 + I\sigma_k^1 I\sigma_k^2) \bigr\rangle \\ &= \tfrac{1}{4} \bigl( 1 - P{\cdot}(I\sigma_k)\, Q{\cdot}(I\sigma_k) \bigr) \\ &= \tfrac{1}{4} (1 - \cos\theta), \end{aligned} \tag{9.82} \]

where $\theta$ is the angle between the spin bivectors P and Q. So, for example, the
probability that both measurements result in the particles having the same spin
($\theta = 0$) is zero, as expected. Similarly, if the measuring devices are aligned,
the probability that particle 1 is up and particle 2 is down is 1/2, whereas if 
there was no entanglement present the probability would be the product of the 
separate single-particle measurements (resulting in 1/4). 
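
Both the rotational invariance of the singlet and the joint probability of equation (9.82) can be reproduced with standard two-spin matrices. This is a minimal sketch; the helper names `spin_up_along`, `rotation` and `joint_probability`, and the particular axis and angle, are illustrative assumptions.

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
singlet = np.array([0.0, 1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2)

def spin_up_along(n):
    """Normalised +1 eigenvector of n . sigma for a unit vector n."""
    n_sigma = sum(nk * sk for nk, sk in zip(n, s))
    vals, vecs = np.linalg.eigh(n_sigma)
    return vecs[:, np.argmax(vals)]

def rotation(axis, angle):
    """SU(2) matrix exp(-i angle axis.sigma / 2) for a unit vector axis."""
    n_sigma = sum(nk * sk for nk, sk in zip(axis, s))
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * n_sigma

def joint_probability(p, q):
    """Probability of finding particle 1 up along p and particle 2 up along q."""
    outcome = np.kron(spin_up_along(p), spin_up_along(q))
    return abs(np.vdot(outcome, singlet)) ** 2

# Joint rotation: the same rotation applied in both particle spaces, as in (9.75).
axis = np.array([1.0, 2.0, 2.0]) / 3.0
u = rotation(axis, 0.7)
rotated_singlet = np.kron(u, u) @ singlet

# Equation (9.82): P = (1 - cos(theta)) / 4 for measurement axes separated by theta.
thetas = np.linspace(0.0, np.pi, 7)
z_axis = np.array([0.0, 0.0, 1.0])
probs = np.array([joint_probability(z_axis, np.array([np.sin(t), 0.0, np.cos(t)]))
                  for t in thetas])
predicted = 0.25 * (1 - np.cos(thetas))
```

The end points of `probs` reproduce the two cases discussed above: zero for equal spin directions and one half for opposite directions along a common axis.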

Some consequences of equation (9.82) run counter to our intuitions about 
locality and causality. In particular, it is impossible to reproduce the statistics 
of equation (9.82) if we assume that the individual particles both know which 
spin state they are in prior to measurement. These contradictions are embodied 
in the famous Bell inequalities. The behaviour of entangled states has now been 
tested experimentally, and the results confirm all of the predictions of quantum 
mechanics. The results are unchanged even if the measurements are performed 
in such a way that the particles cannot be in causal contact. This does not 
provide any conflict with special relativity, as entangled states cannot be used 
to exchange classical information at faster than the speed of light. The reason 
is that the presence of entanglement can only be inferred when the separate 
measurements on the two subsystems are compared. Without knowing which 
measurements observer 1 is performing, observer 2 cannot extract any useful 
classical information from an entangled state. 
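
One common form of the Bell inequalities is the CHSH combination, chosen here purely as an illustration since the text does not single out a particular form. For any model in which both particles carry definite spin values before measurement, the combination below has magnitude at most 2. The sketch evaluates it for the singlet state using the quantum correlation $\langle\epsilon| (a \cdot \hat{\sigma}) \otimes (b \cdot \hat{\sigma}) |\epsilon\rangle$; the measurement angles are a standard choice and the function names are hypothetical.

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
singlet = np.array([0.0, 1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2)

def direction(angle):
    """Unit vector in the x-z plane at the given angle from the z axis."""
    return np.array([np.sin(angle), 0.0, np.cos(angle)])

def correlation(a, b):
    """Singlet expectation value of (a . sigma) (x) (b . sigma) for unit vectors a, b."""
    a_sigma = sum(ak * sk for ak, sk in zip(a, s))
    b_sigma = sum(bk * sk for bk, sk in zip(b, s))
    return (singlet.conj() @ np.kron(a_sigma, b_sigma) @ singlet).real

# Measurement settings: two for observer 1 (a, a2) and two for observer 2 (b, b2).
a, a2 = direction(0.0), direction(np.pi / 2)
b, b2 = direction(np.pi / 4), direction(3 * np.pi / 4)

# CHSH combination; its magnitude cannot exceed 2 in a local hidden-variable model.
chsh = correlation(a, b) - correlation(a, b2) + correlation(a2, b) + correlation(a2, b2)
```

For these settings the magnitude of `chsh` is $2\sqrt{2}$, which exceeds the classical bound while still requiring both observers to compare their records before any correlation becomes visible.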

For many years the properties of entangled states were explored largely as 
a theoretical investigation into the nature of quantum theory. Now, however, 
physicists are starting to view quantum entanglement as a resource that can be 
controlled in the laboratory. To date our control of entangled states is limited, 
but it is improving rapidly, and many predict that before long we will see the 
first viable quantum computers able to exploit this new resource.


9.4 Relativistic states and operators


The ideas developed for the multiparticle Pauli algebra extend immediately to 
the relativistic domain. A single-particle relativistic state is described by an 
arbitrary even element of the full spacetime algebra. Accordingly, a two-particle 
state is constructed from the tensor product of two such states. This results in a
space of 8 × 8 = 64 real dimensions. Post-multiplying the direct-product space
by the quantum correlator E reduces this to 32 real dimensions, which are equivalent
to the 16 complex dimensions employed in standard two-particle relativistic
quantum theory. All the single-particle operators and observables discussed in 
section 8.2 extend in fairly obvious ways. 
To begin, the individual matrix operators have the equivalent action 


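$$
\begin{aligned}
\hat{\gamma}_\mu \otimes \hat{I}\,|\psi\rangle &\leftrightarrow \gamma_\mu^1 \psi \gamma_0^1, \\
\hat{I} \otimes \hat{\gamma}_\mu\,|\psi\rangle &\leftrightarrow \gamma_\mu^2 \psi \gamma_0^2,
\end{aligned} \tag{9.83}
$$

where $\hat{I}$ denotes the 4 × 4 identity matrix. The multiparticle spacetime algebra
operators commute, as they must in order to represent the tensor product. The
result of the action of $\gamma_\mu^1 \psi \gamma_0^1$, for example, does not take us outside the
two-particle state space, since the factor of $\gamma_0^1$ on the right-hand side commutes with
the correlator E. The remaining matrix operators are easily constructed now,
for example

$$ \hat{\gamma}_\mu \hat{\gamma}_\nu \otimes \hat{I}\,|\psi\rangle \leftrightarrow \gamma_\mu^1 \gamma_\nu^1 \psi. \tag{9.84} $$

The role of multiplication by the unit imaginary i is still played by
right-multiplication by J, and the individual helicity projection operators become

$$ \hat{\gamma}_5 \otimes \hat{I}\,|\psi\rangle \leftrightarrow -I^1 \psi J = \psi \sigma_3^1. \tag{9.85} $$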


Relativistic observables are also constructed in a similar manner to the
single-particle case. We form geometric products $\psi\Sigma\tilde{\psi}$, where $\Sigma$ is any combination
of $\gamma_0$ and $\gamma_3$ from either space. The result is then guaranteed to be
Lorentz-covariant and phase-invariant. The first observable to consider is the multivector

$$ \psi\tilde{\psi} = \psi E\tilde{\psi} = \langle \psi E\tilde{\psi} \rangle_{0,8} + \langle \psi E\tilde{\psi} \rangle_4. \tag{9.86} $$

The grade-0 and grade-8 terms are the two-particle generalisation of the scalar +
pseudoscalar combination $\psi\tilde{\psi} = \rho\exp(I\beta)$ found at the single-particle level. The
4-vector part generalises the entanglement terms found in the non-relativistic
case. This allows for a relativistic definition of entanglement, which is important
for a detailed study of the relationship between locality and entanglement.
Next, we form two-particle current and spin vectors:

$$ \mathcal{J} = \langle \psi(\gamma_0^1 + \gamma_0^2)\tilde{\psi} \rangle_1, \tag{9.87} $$

$$ s = \langle \psi(\gamma_3^1 + \gamma_3^2)\tilde{\psi} \rangle_1. \tag{9.88} $$

(The calligraphic symbol $\mathcal{J}$ is used to avoid confusion with the correlated
bivector J.) The full observables will contain grade-1 and grade-5 terms. For
direct-product states the latter are seen to arise from the presence of a $\beta$ factor in
either of the single-particle states. Finally, we can also define the spin bivector
S by

$$ S = \langle \psi J\tilde{\psi} \rangle_2. \tag{9.89} $$


These expressions show how easy it is to generalise the single-particle formulae 
to the multiparticle case. 


9.4.1 The relativistic singlet state 


In the non-relativistic theory the spin singlet state has a special significance, 
both in being maximally entangled, and in its invariance under joint rotations 
in the two-particle space. An interesting question is whether we can construct 
a relativistic analogue that plays the role of a Lorentz singlet. Recalling the
definition of $\epsilon$ (9.69), the property that ensured $\epsilon$ was a singlet state was that

$$ I\sigma_k^1 \epsilon = -I\sigma_k^2 \epsilon, \qquad k = 1, \ldots, 3. \tag{9.90} $$

In addition to (9.90) a relativistic singlet state, which we will denote as $\eta$, must
satisfy

$$ \sigma_k^1 \eta = -\sigma_k^2 \eta, \qquad k = 1, \ldots, 3. \tag{9.91} $$

It follows that $\eta$ satisfies

$$ I^1\eta = \sigma_1^1\sigma_2^1\sigma_3^1\eta = -\sigma_3^2\sigma_2^2\sigma_1^2\eta = I^2\eta. \tag{9.92} $$

For this to hold, $\eta$ must contain a factor of $(1 - I^1I^2)$. We can therefore construct
a Lorentz singlet state by multiplying $\epsilon$ by $(1 - I^1I^2)$, and we define

$$ \eta = (I\sigma_2^1 - I\sigma_2^2)\,\tfrac{1}{2}(1 - I\sigma_3^1 I\sigma_3^2)\,\tfrac{1}{2}(1 - I^1I^2). \tag{9.93} $$

This is normalised so that $2\langle \eta E\tilde{\eta} \rangle = 1$. The properties of $\eta$ can be summarised
as

$$ M^1\eta = \tilde{M}^2\eta, \tag{9.94} $$

where M is an even multivector in either the particle-1 or particle-2 spacetime
algebra. The proof that $\eta$ is a relativistic invariant now reduces to the simple
identity

$$ R^1R^2\eta = R^1\tilde{R}^1\eta = \eta, \tag{9.95} $$

where R is a single-particle relativistic rotor.


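Equation (9.94) can be seen as originating from a more primitive relation
between vectors in the separate spaces. Using the result that $\gamma_0^1\gamma_0^2$ commutes
with $\eta$, we can derive

$$
\begin{aligned}
\gamma_\mu^1\eta\gamma_0^1 &= \gamma_\mu^1\gamma_0^1\gamma_0^2\,\eta\,\gamma_0^2 \\
&= \gamma_0^2(\gamma_0\gamma_\mu)^2\,\eta\,\gamma_0^2 \\
&= \gamma_\mu^2\eta\gamma_0^2.
\end{aligned} \tag{9.96}
$$

For an arbitrary vector a we can now write

$$ a^1\eta\gamma_0^1 = a^2\eta\gamma_0^2. \tag{9.97} $$

Equation (9.94) follows immediately from equation (9.97) by writing

$$
\begin{aligned}
a^1b^1\eta &= a^1b^2\eta\gamma_0^2\gamma_0^1 \\
&= b^2a^2\eta\gamma_0^2\gamma_0^2 \\
&= b^2a^2\eta.
\end{aligned} \tag{9.98}
$$

Equation (9.97) can therefore be viewed as the fundamental property of the
relativistic invariant $\eta$.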
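
The singlet properties quoted in this subsection can be checked numerically. The sketch below is not the construction used in the text: it assumes a particular 16 × 16 complex matrix model of the two-particle spacetime algebra, built from Dirac matrices with an extra `G5` factor so that vectors from different particle spaces anticommute. All variable names are illustrative, and the reverse of $\eta$ is written out by hand from the factors in equation (9.93).

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)
ONE = np.eye(16, dtype=complex)

# Dirac matrices: G[0]^2 = +1, G[k]^2 = -1 (Dirac representation).
G = [np.kron(SZ, I2)] + [np.kron(1j * SY, s) for s in (SX, SY, SZ)]
G5 = 1j * G[0] @ G[1] @ G[2] @ G[3]  # anticommutes with every G[mu], G5^2 = +1

# Two copies of the frame; the G5 factor makes the copies anticommute.
g1 = [np.kron(m, I4) for m in G]
g2 = [np.kron(G5, m) for m in G]


def pauli_frame(g):
    """Return (sigma_k, I, I sigma_k) for one particle space."""
    sig = [g[k] @ g[0] for k in (1, 2, 3)]
    pseudo = g[0] @ g[1] @ g[2] @ g[3]
    return sig, pseudo, [pseudo @ s for s in sig]


sig1, Ps1, Isig1 = pauli_frame(g1)
sig2, Ps2, Isig2 = pauli_frame(g2)

E = (ONE - Isig1[2] @ Isig2[2]) / 2   # quantum correlator
F = (ONE - Ps1 @ Ps2) / 2             # the factor (1 - I^1 I^2)/2
A = Isig1[1] - Isig2[1]
eta = A @ E @ F                       # equation (9.93)
eta_rev = -F @ E @ A                  # reverse: A changes sign, E and F do not


def close(x, y):
    return np.allclose(x, y, atol=1e-12)


c, s = np.cosh(0.4), np.sinh(0.4)     # a boost rotor R = c + s sigma_1
R1, R2 = c * ONE + s * sig1[0], c * ONE + s * sig2[0]

checks = {
    "(9.90)": all(close(Isig1[k] @ eta, -Isig2[k] @ eta) for k in range(3)),
    "(9.91)": all(close(sig1[k] @ eta, -sig2[k] @ eta) for k in range(3)),
    "(9.92)": close(Ps1 @ eta, Ps2 @ eta),
    "normalisation": abs(np.trace(2 * eta @ E @ eta_rev).real / 16 - 1) < 1e-12,
    "(9.95)": close(R1 @ R2 @ eta, eta),
    "(9.97)": all(close(g1[m] @ eta @ g1[0], g2[m] @ eta @ g2[0]) for m in range(4)),
}
print(checks)
```

The scalar part is read off as the real part of the trace divided by 16, which works in this model because every basis blade other than the identity is traceless.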

The invariant $\eta$ can be used to construct a series of observables that are also
invariant under coupled rotations in the two spaces. The first is

$$ 2\eta E\tilde{\eta} = (1 - I^1I^2) - (\sigma_k^1\sigma_k^2 - I\sigma_k^1 I\sigma_k^2). \tag{9.99} $$

The scalar and pseudoscalar (grade-8) terms are clearly invariants, and the
4-vector term, $(\sigma_k^1\sigma_k^2 - I\sigma_k^1 I\sigma_k^2)$, is a Lorentz invariant because it is a contraction
over a complete bivector basis in the two spaces. Next we consider the multivector

$$
\begin{aligned}
2\eta\gamma_0^1\gamma_0^2\tilde{\eta} &= \gamma_0^1\gamma_0^2 - I^1I^2\gamma_0^1\gamma_0^2 - (\gamma_k^1\gamma_k^2 - I^1I^2\gamma_k^1\gamma_k^2) \\
&= (\gamma_0^1\gamma_0^2 - \gamma_k^1\gamma_k^2)(1 - I^1I^2).
\end{aligned} \tag{9.100}
$$

The essential invariant here is the bivector

$$ K = \gamma_\mu^1\gamma^{\mu\,2}, \tag{9.101} $$

and the invariants from (9.100) are simply $K$ and $KI^1I^2$. The bivector K takes
the form of a 'doubling' bivector, which will be encountered again in section 11.4.
From the definition of K in equation (9.101), we find that

$$
\begin{aligned}
K\wedge K &= -2\gamma_0^1\gamma_0^2\gamma_k^1\gamma_k^2 + (\gamma_k^1\gamma_k^2)\wedge(\gamma_j^1\gamma_j^2) \\
&= 2(\sigma_k^1\sigma_k^2 - I\sigma_k^1 I\sigma_k^2),
\end{aligned} \tag{9.102}
$$

which recovers the grade-4 invariant found in equation (9.99). The full set of
two-particle invariants constructed from K is summarised in table 9.1. These
invariants are regularly employed in constructing interaction terms in
multiparticle wave equations.


Invariant        Type of interaction    Grade
1                Scalar                 0
$K$              Vector                 2
$K\wedge K$      Bivector               4
$I^1I^2K$        Pseudovector           6
$I^1I^2$         Pseudoscalar           8


Table 9.1 Relativistic invariants in the two-particle algebra.
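
The grade-4 entry in table 9.1 rests on the two forms of $K\wedge K$ in equation (9.102). As a sketch of the intermediate algebra, assuming the usual conventions $\sigma_k = \gamma_k\gamma_0$ and $I = \gamma_0\gamma_1\gamma_2\gamma_3$ (so that $\gamma_j\gamma_k = -\epsilon_{jkl}I\sigma_l$ for $j \neq k$), and using the fact that vectors from different particle spaces anticommute, the steps can be written as:

```latex
\begin{aligned}
K &= \gamma_0^1\gamma_0^2 - \gamma_k^1\gamma_k^2, \\
\gamma_0^1\gamma_0^2\,\gamma_k^1\gamma_k^2
  &= -(\gamma_k\gamma_0)^1(\gamma_k\gamma_0)^2 = -\sigma_k^1\sigma_k^2, \\
\gamma_j^1\gamma_j^2\,\gamma_k^1\gamma_k^2
  &= -(\gamma_j\gamma_k)^1(\gamma_j\gamma_k)^2 = -I\sigma_l^1 I\sigma_l^2
  \qquad (j \neq k,\ l \text{ the remaining index}), \\
(\gamma_0^1\gamma_0^2)^2 &= (\gamma_k^1\gamma_k^2)^2 = -1
  \qquad (\text{no sum; scalar terms only}), \\
K\wedge K = \langle K^2 \rangle_4
  &= 2\sigma_k^1\sigma_k^2 - 2 I\sigma_k^1 I\sigma_k^2 .
\end{aligned}
```

Each value of $l$ arises from two ordered pairs $(j,k)$, which accounts for the factor of 2 in the final term, and the diagonal terms contribute only to the scalar part $\langle K^2\rangle_0 = -4$.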


9.4.2 Multiparticle wave equations 


The question of how to construct a valid, relativistic, multiparticle wave equation 
has troubled physicists almost from the moment Dirac proposed his equation. 
The question is far from settled, and the current preferred option is to ignore the 
question where possible and instead work within the framework of perturbative 
quantum field theory. This approach runs into difficulties when analysing bound 
states, however, and for these problems the need for a suitable wave equation 
is particularly acute. The main candidate for a relativistic two-particle system 
is the Bethe-Salpeter equation. Written in the multiparticle spacetime algebra, 
this equation is 


$$ (j\nabla_r^1 - m_1)(j\nabla_s^2 - m_2)\psi(r, s) = \mathcal{I}(r, s)\psi(r, s), \tag{9.103} $$

where $\mathcal{I}(r, s)$ is an integral operator representing the interparticle interaction,
and $\nabla_r^1$ and $\nabla_s^2$ denote vector derivatives with respect to $r^1$ and $s^2$ respectively.
The combined vector

$$ x = r^1 + s^2 = r^\mu\gamma_\mu^1 + s^\mu\gamma_\mu^2 \tag{9.104} $$

is the full position vector in eight-dimensional configuration space.

One slightly unsatisfactory feature of equation (9.103) is that it is not first- 
order. This has led researchers to propose a number of alternative equations, 
typically with the aim of providing a more detailed analysis of two-body bound 
state systems such as the hydrogen atom, or positronium. One such equation is 


$$ (\nabla_r^1\psi\gamma_0^1 + \nabla_s^2\psi\gamma_0^2)J = (m_1 + m_2)\psi. \tag{9.105} $$


As well as being first order, this equation also has the required property that it 
is satisfied by direct products of single-particle solutions. But a problem is that 
any distinction between the particle masses has been lost, since only the total 
mass enters. A second candidate equation, which does keep the masses distinct, 
is 


$$ \left(\frac{\nabla_r^1}{m_1} + \frac{\nabla_s^2}{m_2}\right)\psi(x)J = \psi(x)(\gamma_0^1 + \gamma_0^2). \tag{9.106} $$

This equation has a number of attractive features, not least of which is that the
mass enters in a manner that is highly suggestive of gravitational interactions. 
A potential weakness of this equation is that the state space can no longer be 
restricted to sums of direct products of individual states. Instead we have to 
widen the state space to include the entire (correlated) even subalgebra of the 
two-particle spacetime algebra. This doubles the number of degrees of freedom, 
and it is not clear that this doubling can be physical. 

Practically all candidate two-particle wave equations have difficulties in per- 
forming a separation into centre-of-mass and relative coordinates. This is symp- 
tomatic of the fact that the centre of mass cannot be defined sensibly even in 
classical relativistic dynamics. Usually some approximation scheme has to be 
employed to avoid this problem, even when looking for bound state solutions. 
While the question of finding a suitable wave equation remains an interesting 
challenge, one should be wary of the fact that the mass term in the Dirac equa- 
tion is essentially a remainder from a more complicated interaction with the 
Higgs boson. The electroweak theory immediately forces us to consider particle 
doublets, and it could be that one has to consider multiparticle extensions of 
these in order to arrive at a satisfactory theory. 


9.4.3 The Pauli principle


In quantum theory, indistinguishable particles must obey either Fermi-Dirac
or Bose-Einstein statistics. For fermions this requirement results in the Pauli 
exclusion principle that no two particles can occupy a state in which their prop- 
erties are identical. The Pauli principle is usually enforced in one of two ways 
in relativistic quantum theory. At the level of multiparticle wave mechanics, 
antisymmetrisation is enforced by using a Slater determinant representation of 
a state. At the level of quantum field theory, however, antisymmetrisation is a 
consequence of the anticommutation of the creation and annihilation operators 
for fermions. Here we are interested in the former approach, and look to achieve 
the antisymmetrisation in a simple geometrical manner. 
We start by introducing the grade-4 multivector

$$ I_P = \Gamma_0\Gamma_1\Gamma_2\Gamma_3, \tag{9.107} $$

where

$$ \Gamma_\mu = \frac{1}{\sqrt{2}}(\gamma_\mu^1 + \gamma_\mu^2). \tag{9.108} $$

It is a simple matter to verify that $I_P$ has the properties

$$ I_P^2 = -1 \tag{9.109} $$

and

$$ I_P\gamma_\mu^1 I_P = \gamma_\mu^2, \qquad I_P\gamma_\mu^2 I_P = \gamma_\mu^1. \tag{9.110} $$

It follows that $I_P$ functions as a geometrical version of the particle exchange
operator. In particular, acting on the eight-dimensional position vector
$x = r^1 + s^2$ we find that

$$ I_P x I_P = r^2 + s^1, \tag{9.111} $$

where

$$ r^2 = \gamma_\mu^2 r^\mu, \qquad s^1 = \gamma_\mu^1 s^\mu. \tag{9.112} $$

So $I_P$ can be used to interchange the coordinates of particles 1 and 2. Next we
must confirm that $I_P$ is independent of the choice of initial frame. Suppose that
instead we had started with the rotated frame $\{R\gamma_\mu\tilde{R}\}$, with

$$ \Gamma'_\mu = \frac{1}{\sqrt{2}}(R^1\gamma_\mu^1\tilde{R}^1 + R^2\gamma_\mu^2\tilde{R}^2) = R^1R^2\,\Gamma_\mu\,\tilde{R}^2\tilde{R}^1. \tag{9.113} $$

The new $\Gamma'_\mu$ vectors give rise to the rotated 4-vector

$$ I'_P = R^1R^2\,I_P\,\tilde{R}^2\tilde{R}^1. \tag{9.114} $$

But, acting on a bivector in particle space 1, we find that

$$ I_P\,a^1\wedge b^1\,I_P = -(I_Pa^1I_P)\wedge(I_Pb^1I_P) = -a^2\wedge b^2, \tag{9.115} $$

and the same is true of an arbitrary even element in either space. More generally,
the operation $M \mapsto I_PMI_P$ applied to an even element in one of the particle
spaces flips it to the other particle space and changes sign, while applied to an
odd element it just flips the particle space. It follows that

$$ I_P\tilde{R}^2\tilde{R}^1 = \tilde{R}^1I_P\tilde{R}^1 = \tilde{R}^1\tilde{R}^2I_P, \tag{9.116} $$

and substituting this into (9.114) we find that $I'_P = I_P$. It follows that $I_P$ is
independent of the chosen orthonormal frame, as required.
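
The exchange properties of $I_P$ can also be confirmed numerically. The following sketch assumes a 16 × 16 matrix model of the two-particle algebra built from Dirac matrices (a choice made here for illustration, not taken from the text), and tests frame independence with one particular rotor, a boost combined with a spatial rotation. The parameter values and helper names are arbitrary.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)
ONE = np.eye(16, dtype=complex)

# Dirac matrices and two anticommuting copies of the spacetime frame.
G = [np.kron(SZ, I2)] + [np.kron(1j * SY, s) for s in (SX, SY, SZ)]
G5 = 1j * G[0] @ G[1] @ G[2] @ G[3]
g1 = [np.kron(m, I4) for m in G]
g2 = [np.kron(G5, m) for m in G]


def exchange_blade(f1, f2):
    """I_P = Gamma_0 Gamma_1 Gamma_2 Gamma_3 with Gamma_mu = (f1 + f2)/sqrt(2)."""
    gam = [(f1[m] + f2[m]) / np.sqrt(2) for m in range(4)]
    return gam[0] @ gam[1] @ gam[2] @ gam[3]


def rotor(g, alpha, beta):
    """Rotor (and its reverse) for a boost along gamma_1 then a rotation."""
    boost_gen = g[1] @ g[0]   # sigma_1, squares to +1
    rot_gen = g[1] @ g[2]     # spatial bivector, squares to -1
    boost = np.cosh(alpha) * ONE + np.sinh(alpha) * boost_gen
    boost_rev = np.cosh(alpha) * ONE - np.sinh(alpha) * boost_gen
    rot = np.cos(beta) * ONE + np.sin(beta) * rot_gen
    rot_rev = np.cos(beta) * ONE - np.sin(beta) * rot_gen
    return rot @ boost, boost_rev @ rot_rev


def close(x, y):
    return np.allclose(x, y, atol=1e-10)


IP = exchange_blade(g1, g2)

# Exchange of an eight-dimensional position vector x = r^1 + s^2.
rng = np.random.default_rng(0)
r, s = rng.normal(size=4), rng.normal(size=4)
x = sum(r[m] * g1[m] + s[m] * g2[m] for m in range(4))
x_swapped = sum(r[m] * g2[m] + s[m] * g1[m] for m in range(4))

# Same rotor applied in both particle spaces, then I_P rebuilt from the new frame.
R1, R1r = rotor(g1, 0.3, 0.7)
R2, R2r = rotor(g2, 0.3, 0.7)
IP_rot = exchange_blade([R1 @ v @ R1r for v in g1], [R2 @ v @ R2r for v in g2])

checks = {
    "(9.109)": close(IP @ IP, -ONE),
    "(9.110)": all(close(IP @ g1[m] @ IP, g2[m]) and close(IP @ g2[m] @ IP, g1[m])
                   for m in range(4)),
    "(9.111)": close(IP @ x @ IP, x_swapped),
    "frame independence": close(IP_rot, IP),
}
print(checks)
```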

We can now use the 4-vector $I_P$ to encode the Pauli exchange principle
geometrically. Let $\psi(x)$ be a wavefunction for two electrons. The state

$$ \psi'(x) = -I_P\,\psi(I_PxI_P)\,I_P, \tag{9.117} $$

then swaps the position dependence, and interchanges the space of the
multivector components of $\psi$. The antisymmetrised state is therefore

$$ \psi_-(x) = \psi(x) + I_P\,\psi(I_PxI_P)\,I_P. \tag{9.118} $$

For n-particle systems the extension is straightforward, as we require that the
wavefunction is invariant under the interchange enforced by the $I_P$s constructed
from each pair of particles.
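
The invariance referred to here can be made explicit for the two-particle state of equation (9.118). As a short sketch, using only $I_P^2 = -1$ and the consequence $I_P(I_PxI_P)I_P = x$:

```latex
\begin{aligned}
\psi_-(I_PxI_P) &= \psi(I_PxI_P) + I_P\,\psi(x)\,I_P, \\
I_P\,\psi_-(I_PxI_P)\,I_P
  &= I_P\,\psi(I_PxI_P)\,I_P + I_P^2\,\psi(x)\,I_P^2 \\
  &= I_P\,\psi(I_PxI_P)\,I_P + \psi(x) \\
  &= \psi_-(x).
\end{aligned}
```

So $\psi_-$ is unchanged by the map $\psi(x) \mapsto I_P\psi(I_PxI_P)I_P$, which is the sense in which the antisymmetrised state is invariant under the geometric exchange operation.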

For a single Dirac particle the probability current $J = \psi\gamma_0\tilde{\psi}$ has zero
divergence, and can therefore be used to define streamlines. These are valuable for
understanding a range of phenomena, such as wavepacket tunnelling and spin
measurement. We now illustrate how these ideas extend to the multiparticle
domain. The two-particle current is

$$ \mathcal{J} = \langle \psi(\gamma_0^1 + \gamma_0^2)\tilde{\psi} \rangle_1, \tag{9.119} $$

as defined in equation (9.87). The vector $\mathcal{J}$ has components in both particle-1
and particle-2 spaces, which we write as

$$ \mathcal{J} = \mathcal{J}_1^1 + \mathcal{J}_2^2. \tag{9.120} $$

For sums of separable solutions to the single-particle equations, the individual
currents are both conserved:

$$ \nabla^1\cdot\mathcal{J}_1^1 = \nabla^2\cdot\mathcal{J}_2^2 = 0. \tag{9.121} $$

It follows that the full current $\mathcal{J}$ is conserved in 8-dimensional space, so its
streamlines never cross there. The streamlines of the individual particles,
however, are obtained by integrating $\mathcal{J}_1$ and $\mathcal{J}_2$ in a single spacetime, and these can
cross if plotted in the same space. For example, suppose that the wavefunction
is just

$$ \psi = \phi^1(r^1)\chi^2(s^2)E, \tag{9.122} $$

where $\phi$ and $\chi$ are Gaussian wavepackets moving in opposite directions. Since
the distinguishable case is assumed, no Pauli antisymmetrisation is used. One 
can easily confirm that for this case the streamlines and the wavepackets simply 
pass straight through each other. 

But suppose now that we assume indistinguishability, and apply the Pauli
symmetrisation procedure to the wavefunction of equation (9.122). We arrive at
the state

$$ \psi = \bigl(\phi^1(r^1)\chi^2(s^2) - \chi^1(r^2)\phi^2(s^1)\bigr)E, \tag{9.123} $$

from which we form $\mathcal{J}_1$ and $\mathcal{J}_2$, as before. Figure 9.1 shows the streamlines
that result from these currents. In the left-hand plot both particles are in the
same spin state. The corrugated appearance of the lines near the origin is the
result of the streamlines having to pass through a region of highly oscillatory
destructive interference, since the probability of both particles occupying the
same position (the origin) with the same spin state is zero. The right-hand
plot is for two particles in different spin states. Again, the streamlines are seen
to repel. The reason for this can be found in the symmetry properties of the
two-particle current. Given that the wavefunction $\psi$ has been antisymmetrised
according to equation (9.118), the current must satisfy

$$ I_P\,\mathcal{J}(I_PxI_P)\,I_P = \mathcal{J}(x). \tag{9.124} $$

It follows that at the same spacetime position, encoded by $I_PxI_P = x$ in the
two-particle algebra, the two currents $\mathcal{J}_1$ and $\mathcal{J}_2$ are equal. Hence, if two streamlines
ever met, they could never separate again. For the simulations presented here,
the symmetry of the set-up implies that the spatial currents at the origin are
both zero. As the particles approach the origin, they are forced to slow down. The
delay means that they are then swept back in the direction they have just come
from by the wavepacket travelling through from the other side. This repulsion
has its origin in indistinguishability, and the spin of the states exerts only a
marginal effect.

Figure 9.1 Streamlines for an antisymmetrised two-particle wavefunction.
The wavefunction is $\psi = (\phi^1(r^1)\chi^2(s^2) - \chi^1(r^2)\phi^2(s^1))E$. The individual
wavepackets pass through each other, but the streamlines from separate
particles do not cross. The left-hand figure has both particles with spins
aligned in the +z direction. The right-hand figure shows particles with
opposite spins, with $\phi$ in the +z direction, and $\chi$ in the −z direction.


9.5 Two-spinor calculus 


The ideas introduced in this chapter can be employed to construct a geometric al- 
gebra version of the two-spinor calculus developed by Penrose & Rindler (1984). 
The building blocks of their approach are two-component complex spinors,
denoted $\kappa^A$ and $\bar{\omega}^{A'}$. Indices are raised and lowered with the antisymmetric tensor
$\epsilon_{AB}$. In the spacetime algebra version both $\kappa^A$ and $\kappa_A$ have the same multivector
equivalent, which we write as

$$ \kappa^A \leftrightarrow \kappa\,\tfrac{1}{2}(1 + \sigma_3). \tag{9.125} $$

The presence of the idempotent $(1 + \sigma_3)/2$ allows us to restrict $\kappa$ to the Pauli-even
algebra, as any Pauli-odd terms can be multiplied on the right by $\sigma_3$ to convert
them back to the even subspace. This ensures that $\kappa$ has four real degrees of
freedom, as required. Under a Lorentz transformation the full spinor transforms
to

$$ R\kappa\,\tfrac{1}{2}(1 + \sigma_3) = \kappa'\,\tfrac{1}{2}(1 + \sigma_3), \tag{9.126} $$

where R is a Lorentz rotor. If we decompose the rotor R into Pauli-even and
Pauli-odd terms, $R = R_+ + R_-$, then $\kappa'$ is given by

$$ \kappa' = R_+\kappa + R_-\kappa\sigma_3. \tag{9.127} $$

The decomposition into Pauli-even and Pauli-odd terms is frame-dependent, as
it depends on the choice of the $\gamma_0$ direction. But by augmenting $\kappa$ with the
$(1 + \sigma_3)/2$ idempotent we ensure that the full object is a proper Lorentz-covariant
spinor.

The opposite idempotent, $(1 - \sigma_3)/2$, also generates a valid two-spinor which
belongs to a second linear space (or module). This is the $\bar{\omega}^{A'}$ spinor in the
notation of Penrose & Rindler, which we translate to

$$ \bar{\omega}^{A'} \leftrightarrow -\omega I\sigma_2\,\tfrac{1}{2}(1 - \sigma_3). \tag{9.128} $$

The factor of $-I\sigma_2$ is a matter of convention, and is inserted to simplify some of
the later expressions. Under a Lorentz transformation we see that the Pauli-even
element $\omega$ transforms as

$$ \omega \mapsto \omega' = R_+\omega - R_-\omega\sigma_3. \tag{9.129} $$

So $\kappa$ and $\omega$ have different transformation laws: they belong to distinct carrier
spaces of representations of the Lorentz group.

The power of the two-spinor calculus is the ease with which vector and tensor 
objects are generated from the basic two-spinors. As emphasised by Penrose & 
Rindler, this makes the calculus equally useful for both classical and quantum 
applications. It is instructive to see how this looks from the geometric algebra 
point of view. Unsurprisingly, what we discover is that the two-spinor calculus is 
a highly abstract and sophisticated means of introducing the geometric product 
to tensor manipulations. Once this is understood, much of the apparatus of the 
two-spinor calculus can be stripped away, and one is left with the now familiar 
spacetime algebra approach to relativistic physics. 


9.5.1 Two-spinor observables 


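In two-spinor calculus one forms tensor objects from pairs of two-spinors, for
example $\kappa^A\bar{\kappa}^{A'}$. To formulate this in the multiparticle spacetime algebra we
simply multiply together the appropriate spinors, putting each spinor in its own
copy of the spacetime algebra. In this way we replicate the tensor product
implicit in writing $\kappa^A\bar{\kappa}^{A'}$. The result is that we form the object

$$ \kappa^A\bar{\kappa}^{A'} \leftrightarrow -\kappa^1\tfrac{1}{2}(1 + \sigma_3^1)\,\kappa^2 I\sigma_2^2\,\tfrac{1}{2}(1 - \sigma_3^2)\,\tfrac{1}{2}(1 - I\sigma_3^1 I\sigma_3^2). \tag{9.130} $$

As it stands this looks rather clumsy, but the various idempotents hide what is
really going on. The key is to expose the Lorentz singlet structure hidden in the
combination of idempotents. To achieve this we define two new Lorentz singlet
states

$$ \epsilon = \eta\,\tfrac{1}{2}(1 + \sigma_3^1), \qquad \bar{\epsilon} = \eta\,\tfrac{1}{2}(1 - \sigma_3^1), \tag{9.131} $$

where $\eta$ is the Lorentz singlet defined in equation (9.93). These new states both
satisfy the essential equation

$$ M^1\epsilon = \tilde{M}^2\epsilon, \qquad M^1\bar{\epsilon} = \tilde{M}^2\bar{\epsilon}, \tag{9.132} $$

where M is an even-grade multivector. The reason is that any idempotents
applied on the right of $\eta$ cannot affect the result of equation (9.94). Expanding
out in full, and rearranging the idempotents, we find that

$$
\begin{aligned}
\epsilon &= (I\sigma_2^1 - I\sigma_2^2)\,\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E, \\
\bar{\epsilon} &= (I\sigma_2^1 - I\sigma_2^2)\,\tfrac{1}{2}(1 - \sigma_3^1)\,\tfrac{1}{2}(1 - \sigma_3^2)E.
\end{aligned} \tag{9.133}
$$

These relations can be manipulated to give, for example,

$$
\begin{aligned}
I\sigma_2^1\epsilon &= -(1 + I\sigma_2^1 I\sigma_2^2)\,\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E, \\
\sigma_1^1\epsilon &= -(1 - I\sigma_2^1 I\sigma_2^2)\,\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E.
\end{aligned} \tag{9.134}
$$

It follows that

$$ \tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E = -\tfrac{1}{2}(\sigma_1^1 + I\sigma_2^1)\epsilon. \tag{9.135} $$

There are four such identities in total, which are listed in table 9.2.

$$
\begin{aligned}
\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 - \sigma_3^2)E &= -\tfrac{1}{2}(\gamma_0^1 + \gamma_3^1)\,I\sigma_2^1\,\bar{\epsilon}\,\gamma_0^1 \\
\tfrac{1}{2}(1 - \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E &= -\tfrac{1}{2}(\gamma_0^1 - \gamma_3^1)\,I\sigma_2^1\,\epsilon\,\gamma_0^1 \\
\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E &= -\tfrac{1}{2}(\sigma_1^1 + I\sigma_2^1)\,\epsilon \\
\tfrac{1}{2}(1 - \sigma_3^1)\,\tfrac{1}{2}(1 - \sigma_3^2)E &= -\tfrac{1}{2}(-\sigma_1^1 + I\sigma_2^1)\,\bar{\epsilon}
\end{aligned}
$$

Table 9.2 Two-spinor identities. The identities listed here can be used to
convert any expression involving a pair of two-spinors into an equivalent
multivector.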
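
The four identities of table 9.2 lend themselves to a direct numerical test. The sketch below assumes a 16 × 16 matrix model of the two-particle spacetime algebra (Dirac matrices, with a `G5` factor to make the two frames anticommute); this model and all names in it are illustrative choices, not part of the text. Each row of the table is evaluated on both sides and compared.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)
ONE = np.eye(16, dtype=complex)

# Dirac matrices and two anticommuting copies of the spacetime frame.
G = [np.kron(SZ, I2)] + [np.kron(1j * SY, s) for s in (SX, SY, SZ)]
G5 = 1j * G[0] @ G[1] @ G[2] @ G[3]
g1 = [np.kron(m, I4) for m in G]
g2 = [np.kron(G5, m) for m in G]


def pauli_frame(g):
    """Return (sigma_k, I, I sigma_k) for one particle space."""
    sig = [g[k] @ g[0] for k in (1, 2, 3)]
    pseudo = g[0] @ g[1] @ g[2] @ g[3]
    return sig, pseudo, [pseudo @ s for s in sig]


sig1, Ps1, Isig1 = pauli_frame(g1)
sig2, Ps2, Isig2 = pauli_frame(g2)

E = (ONE - Isig1[2] @ Isig2[2]) / 2                     # correlator
eta = (Isig1[1] - Isig2[1]) @ E @ (ONE - Ps1 @ Ps2) / 2  # Lorentz singlet (9.93)
eps = eta @ (ONE + sig1[2]) / 2                          # epsilon, equation (9.131)
eps_bar = eta @ (ONE - sig1[2]) / 2                      # epsilon-bar


def idem(s1, s2):
    """(1 + s1 sigma_3^1)/2 (1 + s2 sigma_3^2)/2 E for signs s1, s2 = +1 or -1."""
    return (ONE + s1 * sig1[2]) @ (ONE + s2 * sig2[2]) @ E / 4


rows = {
    "row 1": (idem(+1, -1), -0.5 * (g1[0] + g1[3]) @ Isig1[1] @ eps_bar @ g1[0]),
    "row 2": (idem(-1, +1), -0.5 * (g1[0] - g1[3]) @ Isig1[1] @ eps @ g1[0]),
    "row 3": (idem(+1, +1), -0.5 * (sig1[0] + Isig1[1]) @ eps),
    "row 4": (idem(-1, -1), -0.5 * (-sig1[0] + Isig1[1]) @ eps_bar),
}
results = {name: np.allclose(lhs, rhs, atol=1e-12) for name, (lhs, rhs) in rows.items()}
print(results)
```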

The results given in table 9.2 enable us to immediately convert any two-spinor 
expression into an equivalent multivector in the spacetime algebra. For example, 
returning to equation (9.130), we form 


$$
\begin{aligned}
-\kappa^1\kappa^2 I\sigma_2^2\,\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 - \sigma_3^2)E
&= \tfrac{1}{2}\kappa^1\kappa^2(\gamma_0^1 + \gamma_3^1)\,\bar{\epsilon}\,\gamma_0^1 \\
&= \tfrac{1}{2}\bigl(\kappa(\gamma_0 + \gamma_3)\tilde{\kappa}\bigr)^1\,\bar{\epsilon}\,\gamma_0^1.
\end{aligned} \tag{9.136}
$$

The key term in this expression is the null vector $\kappa(\gamma_0 + \gamma_3)\tilde{\kappa}$, which is
constructed in the familiar manner for relativistic observables. A feature of the
two-spinor calculus is that it lends itself to formulating most quantities in terms
of null vectors. The origin of these can be traced back to the original $(1 + \sigma_3)/2$
idempotents, which contain the null vector $\gamma_0 + \gamma_3$. These are rotated and dilated
onto spacetime null vectors through the application of a spinor.
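
To see the null vector emerge concretely, one can evaluate $\kappa(\gamma_0 + \gamma_3)\tilde{\kappa}$ for a sample Pauli-even $\kappa$. The sketch below assumes the Dirac matrix representation of the single-particle spacetime algebra and a randomly chosen $\kappa = a_0 + a_k I\sigma_k$; the helper names are illustrative. It checks that the result is a pure vector and that its components have vanishing Minkowski norm.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
ONE = np.eye(4, dtype=complex)

# Dirac matrices: G[0]^2 = +1, G[k]^2 = -1.
G = [np.kron(SZ, I2)] + [np.kron(1j * SY, s) for s in (SX, SY, SZ)]
sig = [G[k] @ G[0] for k in (1, 2, 3)]           # sigma_k = gamma_k gamma_0
PSEUDO = G[0] @ G[1] @ G[2] @ G[3]               # pseudoscalar I
Isig = [PSEUDO @ s for s in sig]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def pauli_even(a):
    """Return kappa = a0 + a_k I sigma_k together with its reverse."""
    bivector = sum(a[k + 1] * Isig[k] for k in range(3))
    return a[0] * ONE + bivector, a[0] * ONE - bivector


def vector_components(v):
    """Components v^mu of a spacetime vector v = v^mu gamma_mu."""
    return np.array([METRIC[m, m] * np.trace(v @ G[m]).real / 4 for m in range(4)])


rng = np.random.default_rng(1)
a = rng.normal(size=4)
kappa, kappa_rev = pauli_even(a)

v = kappa @ (G[0] + G[3]) @ kappa_rev
comp = vector_components(v)
rebuilt = sum(comp[m] * G[m] for m in range(4))

is_vector = np.allclose(v, rebuilt)
norm = comp @ METRIC @ comp
print("pure vector:", is_vector)
print("components:", comp)
print("Minkowski norm:", norm)
```

The time component equals $a_0^2 + a_k a_k$, so the magnitude of $\kappa$ sets the dilation while its rotor part sets the spatial direction of the null vector.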


9.5.2 The two-spinor inner product 


A Lorentz-invariant inner product for a pair of two-spinors is constructed from 
the antisymmetric combination 


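$$ \kappa^A\omega_A = -\kappa_0\omega_1 + \kappa_1\omega_0, \tag{9.137} $$

where the subscripts here denote complex components of a two-spinor. The result
of the inner product is a Lorentz-invariant complex scalar. The antisymmetry
of the inner product tells us that we should form the equivalent expression

$$
\begin{aligned}
(\kappa^1\omega^2 - \kappa^2\omega^1)\,\tfrac{1}{2}(1 + \sigma_3^1)\,\tfrac{1}{2}(1 + \sigma_3^2)E
&= -\tfrac{1}{2}\bigl(\kappa(\sigma_1 + I\sigma_2)\tilde{\omega} - \omega(\sigma_1 + I\sigma_2)\tilde{\kappa}\bigr)^1\epsilon \\
&= -\langle \kappa(\sigma_1 + I\sigma_2)\tilde{\omega} \rangle_{0,4}^1\,\epsilon.
\end{aligned} \tag{9.138}
$$

The antisymmetric product picks out the scalar and pseudoscalar parts of the
quantity $\kappa(\sigma_1 + I\sigma_2)\tilde{\omega}$. This is sensible, as these are the two terms that are
invariant under Lorentz transformations.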
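
The Lorentz invariance of the antisymmetric product is easy to confirm in the conventional component picture, where a Lorentz transformation acts on a two-spinor as a 2 × 2 complex matrix of unit determinant (the setting of exercise 9.8). The sketch below assumes that picture; the functions `inner` and `random_sl2c` are illustrative helpers, and the sign convention follows equation (9.137).

```python
import numpy as np


def inner(kappa, omega):
    """Antisymmetric two-spinor product, -k0*w1 + k1*w0."""
    return -kappa[0] * omega[1] + kappa[1] * omega[0]


def random_sl2c(rng):
    """A random 2 x 2 complex matrix rescaled to have unit determinant."""
    m = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    return m / np.sqrt(np.linalg.det(m))


rng = np.random.default_rng(0)
kappa = rng.normal(size=2) + 1j * rng.normal(size=2)
omega = rng.normal(size=2) + 1j * rng.normal(size=2)
M = random_sl2c(rng)

before = inner(kappa, omega)
after = inner(M @ kappa, M @ omega)
print("det M:", np.linalg.det(M))
print("before:", before)
print("after: ", after)
```

For a general matrix the product is rescaled by det M, so the unit-determinant condition is exactly what makes it invariant.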

The fact that we form a scalar + pseudoscalar combination reveals a second 
important feature of the two-spinor calculus, which is that the unit imaginary is 
a representation of the spacetime pseudoscalar. The complex structure therefore 
has a concrete, geometric significance, which is one reason why two-spinor tech- 
niques have proved popular in general relativity, for example. Further insight 
into the form of the two-spinor inner product is gained by assembling the full 
even multivector 


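$$ \psi = \kappa\,\tfrac{1}{2}(1 + \sigma_3) + \omega I\sigma_2\,\tfrac{1}{2}(1 - \sigma_3). \tag{9.139} $$

The essential term in the two-spinor inner product is now reproduced by

$$
\begin{aligned}
\psi\tilde{\psi} &= -\kappa\,\tfrac{1}{2}(1 + \sigma_3)I\sigma_2\,\tilde{\omega} + \omega I\sigma_2\,\tfrac{1}{2}(1 - \sigma_3)\,\tilde{\kappa} \\
&= -\langle \kappa(\sigma_1 + I\sigma_2)\tilde{\omega} \rangle_{0,4},
\end{aligned} \tag{9.140}
$$

so the inner products pick up both the scalar and pseudoscalar parts of a full
Dirac spinor product $\psi\tilde{\psi}$. This form makes the Lorentz invariance of the product
quite transparent. Interchanging $\kappa$ and $\omega$ in $\psi$ of equation (9.139) is achieved
by right-multiplication by $\sigma_1$, which immediately reverses the sign of $\psi\tilde{\psi}$.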
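
Equation (9.140) and the sign reversal under interchange can both be verified numerically. The sketch below assumes the Dirac matrix representation of the spacetime algebra, in which the reverse of a real multivector M can be computed as `G[0] @ M.conj().T @ G[0]`; that shortcut, the random test spinors and the helper names are assumptions of this illustration rather than constructions from the text.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
ONE = np.eye(4, dtype=complex)

# Dirac matrices: G[0] is Hermitian, G[k] are anti-Hermitian.
G = [np.kron(SZ, I2)] + [np.kron(1j * SY, s) for s in (SX, SY, SZ)]
sig = [G[k] @ G[0] for k in (1, 2, 3)]
PSEUDO = G[0] @ G[1] @ G[2] @ G[3]
Isig = [PSEUDO @ s for s in sig]


def rev(m):
    """Reverse of a real multivector in this representation."""
    return G[0] @ m.conj().T @ G[0]


def pauli_even(a):
    """kappa = a0 + a_k I sigma_k for real coefficients a."""
    return a[0] * ONE + sum(a[k + 1] * Isig[k] for k in range(3))


def dirac_spinor(kappa, omega):
    """The even multivector psi of equation (9.139)."""
    return (kappa @ (ONE + sig[2]) / 2
            + omega @ Isig[1] @ (ONE - sig[2]) / 2)


rng = np.random.default_rng(2)
kappa = pauli_even(rng.normal(size=4))
omega = pauli_even(rng.normal(size=4))

psi = dirac_spinor(kappa, omega)
lhs = psi @ rev(psi)

# For an even element X, (X + reverse(X))/2 keeps the grade-0 and grade-4 parts.
X = kappa @ (sig[0] + Isig[1]) @ rev(omega)
rhs = -(X + rev(X)) / 2

scalar = np.trace(lhs).real / 4
pseudo = -np.trace(PSEUDO @ lhs).real / 4
psi_swapped = dirac_spinor(omega, kappa)

checks = {
    "(9.140)": np.allclose(lhs, rhs),
    "scalar + pseudoscalar only": np.allclose(lhs, scalar * ONE + pseudo * PSEUDO),
    "swap is right-multiplication by sigma_1": np.allclose(psi @ sig[0], psi_swapped),
    "swap reverses sign": np.allclose(psi_swapped @ rev(psi_swapped), -lhs),
}
print(checks)
```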


9.5.3 Spin-frames and the null tetrad


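An important concept in the two-spinor calculus is that of a spin-frame. This
consists of a pair of two-spinors, $\kappa^A$ and $\omega^A$ say, normalised such that
$\kappa^A\omega_A = 1$. In terms of the spinor $\psi$ of equation (9.139), this normalisation condition
amounts to saying that $\psi$ satisfies $\psi\tilde{\psi} = 1$. A normalised spin-frame is therefore
the two-spinor encoding of a spacetime rotor. This realisation also sheds light
on the associated concept of a null tetrad. In terms of the spin frame $\{\kappa^A, \omega^A\}$,
the associated null tetrad is defined as follows:

$$
\begin{aligned}
l^a &= \kappa^A\bar{\kappa}^{A'} \leftrightarrow \bigl(\kappa(\gamma_0 + \gamma_3)\tilde{\kappa}\bigr)^1\,\bar{\epsilon}\,\gamma_0^1, \\
n^a &= \omega^A\bar{\omega}^{A'} \leftrightarrow \bigl(\omega(\gamma_0 + \gamma_3)\tilde{\omega}\bigr)^1\,\bar{\epsilon}\,\gamma_0^1, \\
m^a &= \kappa^A\bar{\omega}^{A'} \leftrightarrow \bigl(\kappa(\gamma_0 + \gamma_3)\tilde{\omega}\bigr)^1\,\bar{\epsilon}\,\gamma_0^1, \\
\bar{m}^a &= \omega^A\bar{\kappa}^{A'} \leftrightarrow \bigl(\omega(\gamma_0 + \gamma_3)\tilde{\kappa}\bigr)^1\,\bar{\epsilon}\,\gamma_0^1.
\end{aligned} \tag{9.141}
$$

In each case we have projected into a single copy of the spacetime algebra to
form a geometric multivector. To simplify these expressions we introduce the
rotor R defined by

$$ R = \kappa\,\tfrac{1}{2}(1 + \sigma_3) + \omega I\sigma_2\,\tfrac{1}{2}(1 - \sigma_3). \tag{9.142} $$

It follows that

$$
\begin{aligned}
R(\gamma_1 + I\gamma_2)\tilde{R} &= -\kappa\gamma_1(1 + \sigma_3)I\sigma_2\,\tilde{\omega} \\
&= \kappa(\gamma_0 + \gamma_3)\tilde{\omega}.
\end{aligned} \tag{9.143}
$$

The null tetrad induced by a normalised spin-frame can now be written in the
spacetime algebra as

$$
\begin{aligned}
l &= R(\gamma_0 + \gamma_3)\tilde{R}, & m &= R(\gamma_1 + I\gamma_2)\tilde{R}, \\
n &= R(\gamma_0 - \gamma_3)\tilde{R}, & \bar{m} &= R(\gamma_1 - I\gamma_2)\tilde{R}.
\end{aligned} \tag{9.144}
$$

(One can choose alternative normalisations, if required.) The complex vectors $m^a$
and $\bar{m}^a$ of the two-spinor calculus have now been replaced by vector + trivector
combinations. This agrees with the earlier observation that the imaginary scalar
in the two-spinor calculus plays the role of the spacetime pseudoscalar. The
multivectors in a null tetrad satisfy the anticommutation relations

$$ \{l, n\} = 4, \qquad \{m, \bar{m}\} = 4, \qquad \text{all others} = 0. \tag{9.145} $$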


These relations provide a framework for the formulation of supersymmetric
quantum theory within the multiparticle spacetime algebra.


9.6 Notes


The multiparticle spacetime algebra was introduced in the paper ‘States and 
operators in the spacetime algebra’ by Doran, Lasenby & Gull (1993a). Since its 
introduction the multiparticle spacetime algebra has been developed by a range 
of researchers. For introductions see the papers by Parker & Doran (2002) and 
Havel & Doran (2000a,2002b). Of particular interest are the papers by Somaroo 
et al. (1998,1999) and Havel et al. (2001), which show how the multiparticle 
spacetime algebra can be applied to great effect in the theory of quantum infor- 
mation processing. These researchers were primarily motivated by the desire to 
create quantum gates in an NMR environment, though their observations can 
be applied to quantum computation in general. For a good introduction to the
subject of quantum information, we recommend the course notes made available 
by Preskill (1998). 

The subject of relativistic multiparticle quantum theory has been tackled by 
many authors. The most authoritative discussions are contained in the papers 
by Salpeter & Bethe (1951), Salpeter (1952), Breit (1929) and Feynman (1961). 
A more modern perspective is contained in the discussions in Itzykson & Zu- 
ber (1980) and Grandy (1991). For more recent attempts at constructing a 
two-particle version of the Dirac equation, see the papers by Galeao & Ferreira 
(1992), Cook (1988) and Koide (1982). A summary of the multiparticle spacetime
algebra approach to this problem is contained in Doran et al. (1996b).

The two-spinor calculus is described in the pair of books ‘Spinors and Space- 
time’ volumes I and II by Penrose & Rindler (1984,1986). The spacetime algebra 
version of two-spinor calculus is described in more detail in ‘Geometric algebra 
and its application to mathematical physics’ by Doran (1994), with additional 
material contained in the paper ‘2-spinors, twistors and supersymmetry in the 
spacetime algebra’ by Lasenby et al. (1993b). The conventions adopted in this 
book differ slightly from those adopted in many of the earlier papers. 


9.7 Exercises 


9.1 Explain how the two-particle Schrödinger equation for the Coulomb
problem is reduced to the effective single-particle equation

$$ -\frac{\hbar^2}{2\mu}\nabla^2\psi + \frac{q_1q_2}{4\pi\epsilon_0 r}\psi = E\psi, $$

where $\mu$ is the reduced mass.

9.2 Given that $\psi(\theta, \phi) = \exp(-\phi I\sigma_3/2)\exp(-\theta I\sigma_2/2)$, prove that

$$ \begin{pmatrix} \sin(\theta/2)\,e^{-i\phi/2} \\ -\cos(\theta/2)\,e^{i\phi/2} \end{pmatrix} \leftrightarrow \psi(\theta, \phi)I\sigma_2. $$

Confirm that this state is orthogonal to $\psi(\theta, \phi)$.

9.3 The interaction energy of two dipoles is given classically by

$$ E = \frac{\mu_0}{4\pi}\left(\frac{\mu_1\cdot\mu_2}{r^3} - 3\,\frac{\mu_1\cdot r\;\mu_2\cdot r}{r^5}\right), $$

where $\mu_i$ denotes the magnetic moment of particle i. For a quantum
system of spin 1/2 particles we replace the magnetic moment vectors
with the operators $\hat{\mu}_k = (\gamma\hbar/2)\hat{\sigma}_k$. Given that $n = r/r$, show that the
Hamiltonian operator takes the form of the 4-vector

$$ H = -\frac{d}{4}\left(\sum_{k=1}^{3} I\sigma_k^1 I\sigma_k^2 - 3\,In^1\,In^2\right) $$

and find an expression for d. Can you solve the two-particle Schrödinger
equation with this Hamiltonian?

9.4 $\psi$ and $\phi$ are a pair of non-relativistic multiparticle states. Prove that
the overlap probability between the two states can be written

$$ P(\psi, \phi) = \frac{\langle (\psi E\tilde{\psi})(\phi E\tilde{\phi}) \rangle - \langle (\psi J\tilde{\psi})(\phi J\tilde{\phi}) \rangle}{2\langle \psi E\tilde{\psi} \rangle\langle \phi E\tilde{\phi} \rangle}. $$

9.5 Investigate the properties of the l = 1, m = 0 state

$$ |\psi\rangle = |0\rangle|1\rangle + |1\rangle|0\rangle. $$

Is this state maximally entangled?

9.6 The $\beta_\mu$ operators that act on states in the two-particle relativistic
algebra are defined by:

$$ \beta_\mu(\psi) = \tfrac{1}{2}(\gamma_\mu^1\psi\gamma_0^1 + \gamma_\mu^2\psi\gamma_0^2). $$

Verify that these operators generate the Duffin-Kemmer ring

$$ \beta_\mu\beta_\nu\beta_\rho + \beta_\rho\beta_\nu\beta_\mu = \eta_{\nu\rho}\beta_\mu + \eta_{\nu\mu}\beta_\rho. $$

9.7 The multiparticle wavefunction $\psi$ is constructed from superpositions
of states of the form $\phi^1(r^1)\chi^2(s^2)$, where $\phi$ and $\chi$ satisfy the
single-particle Dirac equation. Prove that the individual currents $\mathcal{J}_1^1$ and $\mathcal{J}_2^2$
are conserved, where

$$ \mathcal{J}_1^1 + \mathcal{J}_2^2 = \langle \psi(\gamma_0^1 + \gamma_0^2)\tilde{\psi} \rangle_1. $$

9.8 In the two-spinor calculus the two-component complex vector $\kappa^A$ is acted
on by a 2 × 2 complex matrix R. Prove that R is a representation of the
Lorentz rotor group if det R = 1. (This defines the Lie group SL(2,C).)
Hence establish that the antisymmetric combination $\kappa^0\omega^1 - \kappa^1\omega^0$ is a
Lorentz scalar.

9.9 The two-spinor calculus version of the Dirac equation is

$$ \nabla^{A'A}\kappa_A = \mu\,\bar{\omega}^{A'}, \qquad \nabla_{AA'}\bar{\omega}^{A'} = \mu\,\kappa_A, $$

where $\mu = m/\sqrt{2}$. Prove that these equations are equivalent to the
single equation $\nabla\psi I\sigma_3 = m\psi\gamma_0$ and give an expression for $\psi$ in terms
of $\kappa^A$ and $\bar{\omega}^{A'}$.

9.10 A null tetrad is defined by the set

$$
\begin{aligned}
l &= R(\gamma_0 + \gamma_3)\tilde{R}, & m &= R(\gamma_1 + I\gamma_2)\tilde{R}, \\
n &= R(\gamma_0 - \gamma_3)\tilde{R}, & \bar{m} &= R(\gamma_1 - I\gamma_2)\tilde{R}.
\end{aligned}
$$

Prove that these satisfy the anticommutation relations

$$ \{l, n\} = 4, \qquad \{m, \bar{m}\} = 4, \qquad \text{all others} = 0. $$


Geometry


In the preceding chapters of this book we have dealt entirely with a single geomet- 
ric interpretation of the elements of a geometric algebra. But the relationship 
between algebra and geometry is seldom unique. Geometric problems can be 
studied using a variety of algebraic techniques, and the same algebraic result 
can typically be pictured in a variety of different ways. In this chapter, we 
explore a range of alternative geometric systems, and discover how geometric 
algebra can be applied to each of them. We will find that there is no unique 
interpretation forced on the multivectors of a given grade. For example, to date 
we have viewed bivectors solely as directed plane segments. But in projective 
geometry a bivector represents a line, and in conformal geometry a bivector can 
represent a pair of points. 

Ideas from geometry have always been a prime motivating factor in the de- 
velopment of mathematics. By the nineteenth century mathematicians were 
familiar with affine, Euclidean, spherical, hyperbolic, projective and inversive 
geometries. The unifying framework for studying these geometries was provided 
by the Kleinian viewpoint. Under this view a geometry consists of a space of 
points, together with a group of transformations mapping the points onto them- 
selves. Any property of a particular geometry must be invariant under the action 
of the associated symmetry group. Klein was thus able to unite various geome- 
tries by describing how some symmetry groups are subgroups of larger groups. 
For example, Euclidean geometry is a subgeometry of affine geometry, because 
the group of Euclidean transformations is a subgroup of the group of affine trans- 
formations. 

In this chapter we will see how the various classical geometries, and their 
associated groups, are handled in geometric algebra. But we will also go further 
by addressing the question of how to represent various geometric primitives in 
the most compact and efficient way. The Kleinian viewpoint achieves a united 
approach to classical geometry, but it does not help much when it comes to
addressing problems of how to perform calculations efficiently. For example,
circles are as much geometric primitives in Euclidean geometry as points, lines
and planes. But how should circles be represented as algebraic entities? Storing
a point and a radius is unsatisfactory, as this representation involves objects of 
different grades. In this chapter we answer this question by showing that both 
lines and circles are represented as trivectors in the conformal model of Euclidean 
geometry. 

We begin with the study of projective geometry. The addition of an extra 
dimension allows us to create an algebra of incidence relations between points, 
lines and planes in space. We then return to Euclidean geometry, but rather 
than viewing this as a subgeometry of projective geometry (the Kleinian view- 
point), we will instead increase the dimension once more to establish a conformal 
representation of Euclidean geometry. The beauty of this construction is that 
the group of Euclidean transformations can now be formulated as a rotor group. 
Euclidean invariants are then constructed as inner products between multivec- 
tors. This framework allows us to extend the projective treatment of incidence 
relations to include circles and spheres. 

A further attractive feature of the conformal model is that Euclidean, spherical 
and hyperbolic geometries are all handled in the same framework. This allows 
the Poincaré disc model of non-Euclidean geometry in the plane to be extended 
seamlessly to higher dimensions. Of particular importance is the clarification 
of the role of complex coordinates in planar non-Euclidean geometry. Much 
of their utility rests on features of the conformal group of the plane that do 
not extend naturally. Instead, we work within the framework of real geometric 
algebra to obtain results which are independent of dimension. Finally in this 
chapter we turn to spacetime geometry. The conformal model for spacetime is of 
considerable importance in formulations of supersymmetric theories of gravity, 
and also lies at the heart of the twistor program. We display some surprising 
links between these ideas and the multiparticle spacetime algebra described in 
chapter 9. Throughout this chapter we denote the vector space with signature 
p,q by V(p,q), and the geometric algebra of this space by G(p, q). 


10.1 Projective geometry 


There was a time when projective geometry formed a large part of undergraduate 
mathematics courses. For various reasons the subject fell out of fashion in the 
twentieth century, making way for the more relevant subject of differential geom- 
etry. But in recent years projective geometry has enjoyed a resurgence due to its 
importance in the computer graphics industry. For example, the routines at the 
core of the OpenGL graphics language are built on a projective representation 
of three-dimensional space. 

The key idea in projective geometry is that points in space are represented as 




GEOMETRY 


Figure 10.1 Projective geometry. Points in the projective plane are represented
by vectors in a space one dimension higher. The plane $\Pi$ does not
intersect the origin $O$.


vectors in a space of one dimension higher. For example, points in the projective 
plane are represented as vectors in three-dimensional space (see figure 10.1). The 
magnitude of the vector is unimportant, as both $a$ and $\lambda a$ represent the same
point. This representation of points is said to be homogeneous. The two key 
operations in projective geometry are the join and meet. The join of two points, 
for example, is the line between them. Forming the join raises the grade, and 
the join can usually be encoded algebraically via the exterior product (this was 
Grassmann’s original motivation for introducing his exterior algebra). The meet 
is used for forming intersections, such as two lines in a plane meeting at a point. 
The meet is traditionally encoded via the notion of duality, and in geometric 
algebra the role of the meet is played by the inner product. Operations such 
as the meet and join do not depend on the metric, so in projective geometry 
we have a non-metric interpretation of the inner product. This is an important 
point. Some authors have argued that, because geometric algebra is built on a 
quadratic form, it is intimately tied to metric geometry. This view is incorrect, 
as we demonstrate below. 


10.1.1 The projective line 


The simplest place to start is with a one-dimensional line. The ‘Euclidean’ 
model of the line consists of labelling each point with a real number. But there 
are drawbacks with this representation of a line. Geometrically, all points on the 
line are equal. But algebraically there are two exceptional points on the line. 
The first is the origin, which is represented by the algebraically special number 
zero. The second is the point at infinity, which becomes important when we start 
to consider projective transformations. The resolution of both of these problems 
is to represent points in the line as vectors in two-dimensional space. In this way 
the point $x$ is replaced by a pair of homogeneous coordinates $(x_1, x_2)$, with

$$ x = \frac{x_1}{x_2}. \tag{10.1} $$

One can immediately see that the origin is represented by the non-zero vector 
(0,1), and that the point at infinity is (1,0). 

If the vectors $\{e_1, e_2\}$ denote an orthonormal frame for two-dimensional space,
we can set

$$ x = x_1 e_1 + x_2 e_2. \tag{10.2} $$


The set of all non-zero vectors $x$ constitutes the projective line, $RP^1$. The fact
that the origin is excluded implies that in projective spaces one loses linearity.
This is obvious from the fact that $x$ and $\lambda x$ represent the same point, so
linear combinations do not make geometric sense. Indeed, no geometric signifi- 
cance can be attached to the addition of two points in projective geometry. One 
cannot form midpoints, for example, as distances and angles are not projective 
invariants. 

The projective group consists of the group of general linear transformations 
applied to vectors in projective space. For the case of the projective line this 
group is defined by transformations of the form 


$$ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a x_1 + b x_2 \\ c x_1 + d x_2 \end{pmatrix}, \qquad ad - bc \neq 0. \tag{10.3} $$

In terms of points on the line, this transformation corresponds to

$$ x \mapsto x' = \frac{ax + b}{cx + d}. \tag{10.4} $$

The group action includes dilations, inversions and translations. The last are
obtained for the case c = 0, a/d = 1. The fact that translations become linear
transformations in projective geometry is of considerable importance. In
three-dimensional geometry, for example, both rotations and translations can be
encoded as 4 × 4 matrices. While this may appear to be an overly complicated
representation, it makes stringing together a series of translations and rotations
a straightforward exercise. This is important in computer graphics, and is the
representation employed in all OpenGL routines.

In geometric algebra notation we write a general linear transformation as the
map $x \mapsto f(x)$, where $\det(f) \neq 0$. Valid geometric statements in projective
geometry must be invariant under such transformations, which is a strong
restriction. Inner products between projective vectors (points) are clearly not
invariant under projective transformations. The outer product does transform
sensibly, however, due to the properties of the outermorphism. For example,
suppose that the points $\alpha$ and $\beta$ are represented projectively by

$$ a = \alpha e_1 + e_2, \qquad b = \beta e_1 + e_2. \tag{10.5} $$



Figure 10.2 The cross ratio. Points on the lines L and L’ represent two 
different projective views of the same vectors in space. The cross ratio of 
the four points is the same on both lines. 


The outer product of these is

$$ a \wedge b = (\alpha - \beta)\, e_1 \wedge e_2, \tag{10.6} $$


which is controlled by the distance between the points on the line. Under a 
projective transformation in two dimensions 


$$ e_1 \wedge e_2 \mapsto f(e_1 \wedge e_2) = \det(f)\, e_1 \wedge e_2, \tag{10.7} $$


which is just an overall scaling. 

The fact that distances between points are scaled under a projective transfor- 
mation provides us with an important projective invariant for four points on a 
line. This is formed from ratios of lengths along a line. For the ratio to be a
true projective invariant, it must also be unchanged when each vector is rescaled
individually. We therefore define the cross ratio of four points, A,
B, C, D, by


$$ (ABCD) = \frac{AC}{BC}\,\frac{BD}{AD} = \frac{a \wedge c \;\, b \wedge d}{b \wedge c \;\, a \wedge d}, \tag{10.8} $$

where AB denotes the distance between A and B. Given any four points on
a line, their cross ratio is a projective invariant (see figure 10.2). The figure
illustrates one possible geometric interpretation of a projective transformation,
which is that the line onto which points are projected is transformed to a new line.
Invariants such as the cross ratio are important in computer vision where, for
example, we seek to extract three-dimensional information from a series of
two-dimensional scenes. Knowledge of invariants can help establish point matches
between the scenes. 
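
The invariance of the cross ratio can be checked numerically. The following Python sketch applies a sample projective transformation of the form (10.3) to four points on the line, rescales each image vector individually, and compares the cross ratio (10.8) before and after. The helper names (`to_homogeneous`, `apply`, `wedge`, `cross_ratio`) and the sample matrix are illustrative assumptions, not part of the text; exact rational arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction

def to_homogeneous(x):
    """Point x on the line -> homogeneous pair (x1, x2) with x = x1 / x2."""
    return (Fraction(x), Fraction(1))

def apply(m, p):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to the homogeneous pair p."""
    (a, b), (c, d) = m
    return (a * p[0] + b * p[1], c * p[0] + d * p[1])

def wedge(p, q):
    """Coefficient of e1^e2 in the outer product p ^ q."""
    return p[0] * q[1] - p[1] * q[0]

def cross_ratio(a, b, c, d):
    """(a^c b^d) / (b^c a^d), the right-hand side of equation (10.8)."""
    return (wedge(a, c) * wedge(b, d)) / (wedge(b, c) * wedge(a, d))

points = [to_homogeneous(x) for x in (0, 1, 3, 7)]
m = ((2, 1), (1, 3))                      # sample matrix, det = 5 (invertible)
images = [apply(m, p) for p in points]

# Rescale each image vector by a different factor; the points are unchanged.
rescaled = [(k * p[0], k * p[1]) for k, p in zip((2, -3, 5, 7), images)]

before = cross_ratio(*points)
after = cross_ratio(*rescaled)
print(before, after)
```

Each outer product picks up the same factor of det(f), and every vector appears once in the numerator and once in the denominator, so both the determinant and the individual scale factors cancel.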


10.1.2 The projective plane 


Rather more interesting than the case of a line is that of the projective plane. 
Points in the plane are now represented by vectors in the three-dimensional 
algebra G(3,0). Figure 10.1 shows that the line between the points a and b is 
the result of projecting the plane defined by a and b onto the projective plane. 
We therefore define the join of the points a and b by 


$$ \mathrm{join}(a, b) = a \wedge b. \tag{10.9} $$


Bivectors thus define lines in projective geometry. The line itself is recovered 
by solving the equation 


$$ a \wedge b \wedge x = 0. \tag{10.10} $$


This equation is solved by

$$ x = \lambda a + \mu b, \tag{10.11} $$


which defines the set of projective points on the line joining A and B. 

By taking exterior products of vectors we define (projectively) higher dimensional
objects. For example, the join of a point $a$ and a line $b \wedge c$ is the plane
defined by the trivector $a \wedge b \wedge c$. Three points on a line cannot define a projected
area, so for these we must have

$$ a \wedge b \wedge c = 0 \quad \Rightarrow \quad a,\ b,\ c \ \text{collinear}. \tag{10.12} $$

This was the condition used to recover the points $x$ on the line $a \wedge b$. The join
itself can be slightly more problematic. Given three points one cannot just write
that their join is $a \wedge b \wedge c$, as the result may be zero. Instead the join is defined as
the smallest subspace containing a, b and c. If they are collinear, then the join 
is the common line. This is well defined mathematically, but is hard to encode 
computationally. The problem is that the finite precision used on computers 
means that testing for zero is unreliable. Wherever possible it is safer to avoid 
defining the join and instead work with the exterior product. 
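
The numerical difficulty described above can be made concrete. In G(3,0) the trivector $a \wedge b \wedge c$ has a single component, equal to the determinant of the three coordinate columns, so the collinearity condition (10.12) becomes a determinant test. The sketch below is a minimal illustration under assumptions of our own: the function names are hypothetical, and the relative tolerance `tol` is an arbitrary choice that would need tuning for real data.

```python
def wedge3(a, b, c):
    """Coefficient of e1^e2^e3 in a ^ b ^ c (a 3x3 determinant)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def norm(v):
    """Euclidean length, used only to make the test independent of scale."""
    return sum(x * x for x in v) ** 0.5

def collinear(a, b, c, tol=1e-9):
    """Test a ^ b ^ c = 0 up to a relative tolerance (tol is an assumed value)."""
    scale = norm(a) * norm(b) * norm(c)
    return abs(wedge3(a, b, c)) <= tol * scale

def embed(x, y):
    """Planar point (x, y) -> homogeneous vector (x, y, 1)."""
    return (x, y, 1.0)

print(collinear(embed(0, 0), embed(1, 1), embed(2, 2)))    # points on y = x
print(collinear(embed(0, 0), embed(1, 1), embed(2, 2.1)))  # third point moved off the line
```

Dividing by the product of the vector lengths keeps the test homogeneous, so rescaling any of the three vectors does not change the outcome.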

Projective geometry deals with relationships that are invariant under projec- 
tive transformations. The join is one such concept — as two points are trans- 
formed the line joining them transforms in the obvious way: 


$$ a \wedge b \mapsto f(a) \wedge f(b) = f(a \wedge b). \tag{10.13} $$


So, for example, the statement that three points lie on a line ($a \wedge b \wedge c = 0$) is
unchanged by a projective transformation. Similarly, the statement that three 
lines intersect at a point must also be a projective invariant. We therefore seek 




an algebraic encoding of the intersection of two lines. This is called the meet,
usually denoted with the $\vee$ symbol. Before we can encode this, however, we need
to define the dual. In the projective plane, points and lines are represented as 
vectors and bivectors in G(3,0). We know that these can be interchanged via 
a duality transformation, which amounts to multiplying by the pseudoscalar I. 
In this way every point has a dual line, and vice versa. The geometric picture 
associated with duality depends on the embedding plane. 

If we denote the dual of $A$ by $A^*$, the meet $A \vee B$ is defined by the ‘de Morgan’
rule

$$ (A \vee B)^* = A^* \wedge B^*. \tag{10.14} $$

For a pair of lines in a plane, this amounts to

$$ A \vee B = -I\,(IA) \wedge (IB) = I\, A \times B = A \cdot (IB) = (IA) \cdot B. \tag{10.15} $$


These formulae show how the inner product can be used to encode the meet, 
without imposing a metric on projective space. The expression 


$$ A \vee B = I\, A \times B \tag{10.16} $$


shows how the construction works. In three dimensions, $A \times B$ is the plane
perpendicular to $A$ and $B$, and $I\, A \times B$ is the line perpendicular to this plane, through
the origin. This is therefore the line common to both planes, so projectively gives 
the point of intersection of two lines. 
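
In coordinates this construction reduces to familiar vector operations. If each line (a bivector $A$) is stored through the vector $u$ with $A = Iu$, then the join of two points and the meet of two lines are both given by a cross product. The Python sketch below is an illustration under that assumed storage convention; the function names are our own.

```python
def cross(u, v):
    """Cross product of two 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def join(a, b):
    """Line through the points a and b, stored as the dual vector of a ^ b."""
    return cross(a, b)

def meet(l1, l2):
    """Intersection point of two lines given by their dual vectors."""
    return cross(l1, l2)

x_axis = join((0, 0, 1), (1, 0, 1))       # the line y = 0
vertical = join((2, 0, 1), (2, 1, 1))     # the line x = 2
parallel = join((0, 1, 1), (1, 1, 1))     # the line y = 1

print(meet(x_axis, vertical))   # homogeneous coordinates of the point (2, 0)
print(meet(x_axis, parallel))   # third component 0: a point at infinity
```

The second result has a vanishing third component, which is how the point at infinity shared by two parallel lines appears in homogeneous coordinates.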

The meet of two distinct lines in a plane always results in a non-zero point. 
If the lines are parallel then their meet returns the point at infinity. Parallelism 
is not a projective invariant, however, so under a projective transformation two 
parallel lines can transform to lines intersecting at a finite point. This illustrates 
the fact that the point at infinity does not necessarily stay at infinity under 
projective transformations. It is instructive to see how the meet itself transforms 
under a projective transformation. Using the results of section 4.4, we find that 


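$$ \begin{aligned} A \vee B \mapsto f(A) \vee f(B) &= -I\,(I f(A)) \wedge (I f(B)) \\ &= -\det(f)^2\, I\, \bar{f}^{-1}(IA) \wedge \bar{f}^{-1}(IB) \\ &= -\det(f)^2\, I\, \bar{f}^{-1}\big((IA) \wedge (IB)\big) \\ &= \det(f)\, f\big(-I\,(IA) \wedge (IB)\big). \end{aligned} \tag{10.17} $$


We can summarise this result as

$$ f(A) \vee f(B) = \det(f)\, f(A \vee B). \tag{10.18} $$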


But in projective geometry, $a$ and $\lambda a$ represent the same point, so the factor of
det (f) does not affect the resulting point. This confirms that under a projective 
transformation the meet transforms as required. 
Figure 10.3 Desargues’ theorem. The lines P, Q, R meet at a point if and
only if the points p, q,r lie on a line. The two triangles are then projectively 
related. 


The condition that three lines meet at a common point requires that the meet 
of two lines lies on a third line, which goes as 


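$$ (A \vee B) \wedge C = (I\, A \times B) \wedge C = 0. \tag{10.19} $$

Dualising this result we obtain the condition

$$ \langle (A \times B)\, C \rangle = \langle ABC \rangle = 0 \quad \Rightarrow \quad A,\ B,\ C \ \text{coincident}. \tag{10.20} $$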


This is an extremely simple algebraic encoding of the statement that three lines 
(represented by bivectors) all meet at a common point. Equations like this 
demonstrate how powerful geometric algebra can be when applied in a projective 
setting. 

As an application consider Desargues’ theorem, which is illustrated in fig- 
ure 10.3. The points a, b, c and a’, b', c’ define two triangles. The associated 
lines are defined by 


$$ A = b \wedge c, \qquad B = c \wedge a, \qquad C = a \wedge b, \tag{10.21} $$


with the same definitions holding for A’, B’, C’ in terms of a’, b’,c’. The two sets 
of vertices determine the lines

$$ P = a \wedge a', \qquad Q = b \wedge b', \qquad R = c \wedge c', \tag{10.22} $$

and the two sets of lines determine the points

$$ p = A \times A'\, I, \qquad q = B \times B'\, I, \qquad r = C \times C'\, I. \tag{10.23} $$


Desargues’ theorem states that, if p, q, r lie on a common line, then P, Q and 
R all meet at a common point. The latter condition requires 


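$$ \langle PQR \rangle = \langle a \wedge a' \;\, b \wedge b' \;\, c \wedge c' \rangle = 0. \tag{10.24} $$

Similarly, for $p$, $q$, $r$ to fall on a line we form

$$ \begin{aligned} p \wedge q \wedge r &= \langle A \times A'\, I \;\, B \times B'\, I \;\, C \times C'\, I \rangle_3 \\ &= -I\, \langle A \times A' \;\, B \times B' \;\, C \times C' \rangle. \end{aligned} \tag{10.25} $$


Desargues’ theorem is then proved by the algebraic identity

$$ \langle a \wedge b \wedge c \;\, a' \wedge b' \wedge c' \rangle \, \langle a \wedge a' \;\, b \wedge b' \;\, c \wedge c' \rangle = \langle A \times A' \;\, B \times B' \;\, C \times C' \rangle, \tag{10.26} $$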


the proof of which is left as an exercise. The left-hand side vanishes if and only 
if the lines P, Q, R meet at a point. The right-hand side vanishes if and only if 
the points p, q, r lie on a line. This proves the theorem. The complex geometry 
illustrated in figure 10.3 has therefore been reduced to a straightforward algebraic 
identity. 
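
A numerical instance of the theorem is easy to construct. The sketch below builds two triangles that are in perspective from a chosen point `o`, so that the lines P, Q, R are concurrent by construction, and then checks that the meets of corresponding sides are collinear. Lines are stored through their dual vectors, so joins and meets are both cross products; this storage convention, the helper names and the sample coordinates are assumptions made for the illustration. Integer arithmetic keeps the test exact.

```python
def cross(u, v):
    """Cross product; serves as the dual of the outer product u ^ v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(a, b, c):
    """Coefficient of e1^e2^e3 in a ^ b ^ c."""
    return sum(x * y for x, y in zip(a, cross(b, c)))

def combo(k, u, v):
    """k*u + v: another homogeneous point on the line through u and v."""
    return tuple(k * x + y for x, y in zip(u, v))

o = (1, 1, 1)                                   # centre of perspective (sample value)
a, b, c = (0, 0, 1), (4, 0, 1), (0, 3, 1)       # first triangle
a2, b2, c2 = combo(2, a, o), combo(3, b, o), combo(5, c, o)   # second triangle

# Lines joining corresponding vertices; all pass through o by construction.
P, Q, R = cross(a, a2), cross(b, b2), cross(c, c2)

# Sides of each triangle, then the meets of corresponding sides.
A, B, C = cross(b, c), cross(c, a), cross(a, b)
A2, B2, C2 = cross(b2, c2), cross(c2, a2), cross(a2, b2)
p, q, r = cross(A, A2), cross(B, B2), cross(C, C2)

concurrent = det3(P, Q, R) == 0
collinear = det3(p, q, r) == 0
print(concurrent, collinear)
```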

We can find a simple generalisation of the cross ratio for the case of the projec- 
tive plane. From the derivation of the cross ratio, it is clear that any analogous 
object for the plane must involve ratios of trivectors. These represent areas in 
the projective plane. For example, suppose we have six points in space with 
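position vectors $a_1, \ldots, a_6$. These produce the six projected points $A_1, \ldots, A_6$.
An invariant is formed by

$$ \frac{a_5 \wedge a_4 \wedge a_3 \;\; a_6 \wedge a_2 \wedge a_1}{a_5 \wedge a_1 \wedge a_3 \;\; a_6 \wedge a_2 \wedge a_4} = \frac{A_{543}\, A_{621}}{A_{513}\, A_{624}}, \tag{10.27} $$

where $A_{ijk}$ is the projected area of the triangle with vertices $A_i$, $A_j$, $A_k$. Again,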
elementary algebraic reasoning quickly yields a geometrically significant result. 


10.1.3 Homogeneous coordinates and projective splits


In typical applications of projective geometry we are interested in the relationship 
between coordinates in an image plane (for example in terms of pixels relative to 
some origin) and the three-dimensional position vector. Suppose that the origin 
in the image plane is defined by the vector n, which is perpendicular to the plane. 
The line on the image plane from the origin to the image point is represented by 
the bivector $a \wedge n$ (see figure 10.4). The vector OA belongs to a two-dimensional


Figure 10.4 The image plane. Vectors in the image plane, OA, are de- 
scribed by bivectors in G(3,0). The point A can be expressed in terms of 
homogeneous coordinates in the image plane. 


geometric algebra. We can relate this directly to the three-dimensional algebra 
by first writing 


$$ n + OA = \lambda a. \tag{10.28} $$


Contracting with $n$, we find that $\lambda = n^2 (a \cdot n)^{-1}$. It follows that

$$ OA = \frac{n^2 a - (a \cdot n)\, n}{a \cdot n} = \frac{a \wedge n}{a \cdot n}\, n. \tag{10.29} $$


If we now drop the final factor of $n$, we obtain a bivector that is homogeneous
in both $a$ and $n$. In this way we can directly represent the line OA in two
dimensions with the bivector

$$ A = \frac{a \wedge n}{a \cdot n}. \tag{10.30} $$


This is the projective split, first introduced in chapter 5 as a means of relating 
physics as seen by observers with different velocities. 

The map of equation (10.30) relates bivectors in a higher dimensional space 
to vectors in a space of dimension one lower. If we introduce a coordinate frame 
$\{e_i\}$, with $e_3$ in the $n$ direction, we see that the coordinates of the image of
$a = a_i e_i$ are

$$ A = \frac{a_1}{a_3}\, e_1 e_3 + \frac{a_2}{a_3}\, e_2 e_3 = A_1 E_1 + A_2 E_2. \tag{10.31} $$

This equation defines the homogeneous coordinates $A_i$:

$$ A_i = \frac{a_i}{a_3}. \tag{10.32} $$

Homogeneous coordinates are independent of scale and it is these that are usually
measured in a camera projection of a scene. The bivectors $(E_1, E_2)$ act as
generators for a two-dimensional geometric algebra. If the vectors in the projective
space are all Euclidean, the $E_i$ bivectors will have negative square. If
necessary, this can be avoided by letting e3 be an anti-Euclidean vector. The 
projective split is an elegant scheme for relating results in projective space to 
Euclidean space one dimension lower. Algebraically, the projective split rests on 
the isomorphism 


$$ G^+(p + 1, q) \cong G(q, p). \tag{10.33} $$


This states that the even subalgebra of the geometric algebra with signature 
(p+ 1,q) is isomorphic to the algebra with signature (q, p). The projective split 
is not always the best way to map from projective space back to Euclidean space, 
however, as constructing a set of bivectors can be an unnecessary complication. 
Often it is simpler to choose an orthonormal frame, with n one of the frame 
vectors, and then scale all vectors $x$ such that $n \cdot x = 1$.
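
Both routes back to the image plane are simple to state in coordinates. The sketch below computes the homogeneous coordinates of equation (10.32) and, alternatively, rescales a vector so that $n \cdot x = 1$. It assumes an orthonormal frame with $n$ along the third axis and integer input components; the function names are hypothetical.

```python
from fractions import Fraction

def homogeneous_coords(a):
    """Image-plane coordinates A_i = a_i / a_3, as in equation (10.32)."""
    a1, a2, a3 = a
    if a3 == 0:
        raise ValueError("a is parallel to the image plane (point at infinity)")
    return (Fraction(a1, a3), Fraction(a2, a3))

def normalise(a, n=(0, 0, 1)):
    """Rescale a so that n . a = 1; n is assumed to be a unit frame vector."""
    s = sum(x * y for x, y in zip(n, a))
    return tuple(Fraction(x, s) for x in a)

print(homogeneous_coords((2, 4, 2)))     # same image point as (1, 2, 1)
print(homogeneous_coords((-6, -12, -6))) # rescaling a leaves the result unchanged
print(normalise((2, 4, 2)))
```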


10.1.4 Projective geometry in three dimensions 


To handle complicated three-dimensional problems in a projective framework 
we require a four-dimensional geometric algebra. The basic elements of four- 
dimensional geometric algebra will be familiar from relativity and the spacetime 
algebra, though now the elements are given a projective interpretation. The 
algebra of a four-dimensional space contains six bivectors, which represent lines 
in three dimensions. As in the planar case, the important feature of the projective 
framework is that we are free from the restriction that all lines pass through the 
origin. The line through the points a and b is again represented by the bivector 
$a \wedge b$. This is a blade, as must be the case for any bivector representing a line.
Any bivector blade $B = a \wedge b$ must satisfy the algebraic condition

$$ B \wedge B = a \wedge b \wedge a \wedge b = 0, \tag{10.34} $$

which removes one degree of freedom from the six components needed to specify
an arbitrary bivector. This is known as the Plücker condition. If the vector $e_4$
defines the projection into Euclidean space, the line $a \wedge b$ has coordinates

$$ a \wedge b = (\mathbf{a} + e_4) \wedge (\mathbf{b} + e_4) = \mathbf{a} \wedge \mathbf{b} + (\mathbf{a} - \mathbf{b}) \wedge e_4, \tag{10.35} $$

where $\mathbf{a}$ and $\mathbf{b}$ denote vectors in the three-dimensional space. The bivector $B$
therefore encodes a line as a combination of a tangent $(\mathbf{b} - \mathbf{a})$ and a moment
$\mathbf{a} \wedge \mathbf{b}$. These are the Plücker coordinates for a line.

Given two lines as bivectors B and B’, the test that they intersect in three 
dimensions is that their join does not span all of projective space, which implies 
that

$$ B \wedge B' = 0. \tag{10.36} $$


This provides a projective interpretation for commuting bivectors in four dimen- 
sions. Commuting (orthogonal) bivectors have BB’ equalling a multiple of the 
pseudoscalar. Projectively, these can be interpreted as two lines in three dimen- 
sions that do not share a common point. As mentioned earlier, the problem with 
a test such as equation (10.36) is that one can never guarantee to obtain zero 
when working to finite numerical precision. In practice, then, one tends to avoid 
trying to find the intersection of two lines in three dimensions, unless there
is good reason to believe that they intersect at a point. 
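
The Plücker description and the intersection test can be illustrated in ordinary vector notation. In the sketch below a line through two points is stored as a pair (tangent, moment), with the moment bivector $\mathbf{a} \wedge \mathbf{b}$ replaced by its dual, the cross product. The quantity called `reciprocal` here is, up to sign, the single component of $B \wedge B'$, so it vanishes exactly when the two lines are coplanar. The function names and the dual-vector storage are assumptions made for this example.

```python
def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plucker(a, b):
    """Line through the 3D points a and b as (tangent, moment) = (b - a, a x b)."""
    return sub(b, a), cross(a, b)

def reciprocal(l1, l2):
    """Zero when the two lines are coplanar; mirrors the condition (10.36)."""
    (d1, m1), (d2, m2) = l1, l2
    return dot(d1, m2) + dot(d2, m1)

l1 = plucker((1, 1, 0), (1, 1, 5))    # vertical line through (1, 1, z)
l2 = plucker((0, 0, 2), (2, 2, 2))    # crosses l1 at (1, 1, 2)
l3 = plucker((0, 0, 0), (1, 0, 0))    # the x axis
l4 = plucker((0, 0, 1), (0, 1, 1))    # parallel to the y axis, at height 1

print(reciprocal(l1, l2))   # intersecting lines
print(reciprocal(l3, l4))   # skew lines
print(reciprocal(l1, l1))   # Pluecker condition: tangent is orthogonal to moment
```

With floating-point input the first test would be replaced by a comparison against a tolerance, which is exactly the reliability problem noted above.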

The exterior product of three vectors in projective space results in the trivector 
encoding the plane containing the three points. One of the most frequently 
encountered problems is finding the point of intersection of a line L and a plane 
P. This is given by 

$$ x = P \cdot (IL), \tag{10.37} $$


where I is the four-dimensional pseudoscalar. This will always return a point, 
provided the line does not lie entirely in the plane. Similarly, the intersection of 
two planes in three dimensions must result in a line. Algebraically, this line is 
encoded by the bivector 


$$ L = (I P_1) \cdot P_2 = I\, P_1 \times P_2, \tag{10.38} $$


where $P_1$ and $P_2$ are the two planes. Such projective formulae are important in
computer vision and graphics applications. 


10.2 Conformal geometry 


Projective geometry does provide an efficient framework for handling Euclidean 
geometry. Euclidean geometry is a subgeometry of projective geometry, so any 
valid result in the latter must hold in the former. But there are some limitations 
to the projective viewpoint. Euclidean concepts, like lengths and angles, are 
not straightforwardly encoded, and the related concepts of circles and spheres 
are equally awkward. Conformal geometry provides an elegant solution to this 
problem. The key is to introduce a further dimension of opposite signature, 
so that points in a space of signature (p,q) are modelled as null vectors in a 
space of signature (p + 1,q + 1). That is, points in V(p,q) are represented by 
null vectors in V(p + 1,q + 1). Projective geometry is retained as a subset of 
conformal geometry, but the range of geometric primitives is extended to include 
circles and spheres. 

We denote a point in V(p,q) by x, and its conformal representation by X. We 
continue to employ the spacetime notation of using the tilde symbol to denote 
Figure 10.5 A stereographic projection. The line is mapped into the unit
circle, so the points on the line $x_1$ and $x_2$ are mapped to the unit vectors
$\hat{x}_1$ and $\hat{x}_2$. The origin and infinity are mapped to opposite points on the
circle. 


the reverse operation for a general multivector in any geometric algebra. A basis 
set of vectors for G(p, q) is denoted by $\{e_i\}$, and the two additional vectors $\{e, \bar{e}\}$
complete this to an orthonormal basis for G(p + 1,q + 1). 


10.2.1 Stereographic projection of a line 


We illustrate the general construction by starting with the simple case of a line. 
In projective geometry points on a line are modeled as two-dimensional vectors. 
The conformal model is established from a slightly different starting point, using 
the stereographic projection. Under a stereographic projection, points on a line 
are mapped to the unit circle in a plane (see figure 10.5). Points on the unit 
circle in two dimensions are represented by 


$$ \hat{x} = \cos(\theta)\, e_1 + \sin(\theta)\, e_2. \tag{10.39} $$

The corresponding point on the line is given by

$$ x = \frac{\cos(\theta)}{1 + \sin(\theta)}. \tag{10.40} $$

This relation inverts simply to give

$$ \cos(\theta) = \frac{2x}{1 + x^2}, \qquad \sin(\theta) = \frac{1 - x^2}{1 + x^2}. \tag{10.41} $$

So far we have achieved a representation of the line in terms of a circle in two
dimensions. But the constraint that the vector has unit magnitude means that
we have lost homogeneity. To get round this we introduce a third vector, $\bar{e}$,
which has negative signature,

$$ \bar{e}^2 = -1, \tag{10.42} $$


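and we assume that $\bar{e}$ is orthogonal to $e_1$ and $e_2$. We can now replace the unit
vector $\hat{x}$ with the null vector $X$, where

$$ X = \cos(\theta)\, e_1 + \sin(\theta)\, e_2 + \bar{e} = \frac{2x}{1 + x^2}\, e_1 + \frac{1 - x^2}{1 + x^2}\, e_2 + \bar{e}. \tag{10.43} $$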


The vector $X$ satisfies $X^2 = 0$, so is null.

The equation $X^2 = 0$ is homogeneous. If it is satisfied for $X$, it is satisfied
for $\lambda X$. We can therefore move to a homogeneous representation and let both
$X$ and $\lambda X$ represent the same point. Multiplying by $(1 + x^2)$ we establish the
conformal representation

$$ X = 2x e_1 + (1 - x^2)\, e_2 + (1 + x^2)\, \bar{e}. \tag{10.44} $$


This is the basic representation we use throughout. To establish a more general 
notation we first replace the vector $e_2$ by $-e$. We therefore have

$$ e^2 = 1, \qquad \bar{e}^2 = -1, \qquad e \cdot \bar{e} = 0. \tag{10.45} $$

The vectors $e$ and $\bar{e}$ are then the two extra vectors that extend the space V(p, q)
to V(p + 1, q + 1). Frequently, it is more convenient to work with a null basis for
the extra dimensions. We define

$$ n = e + \bar{e}, \qquad \bar{n} = e - \bar{e}. \tag{10.46} $$

These vectors satisfy

$$ n^2 = \bar{n}^2 = 0, \qquad n \cdot \bar{n} = 2. \tag{10.47} $$

The vector $X$ is now

$$ X = 2x e_1 + x^2 n - \bar{n}. \tag{10.48} $$


It is straightforward to confirm that this is a null vector. The set of all null 
vectors in this space form a cone, and the real number line is modelled by the 
intersection of this cone and a plane. The construction is illustrated in figure 10.6. 
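
The relations of this section can be verified with exact arithmetic. The sketch below maps a point on the line to the unit circle using equation (10.41), inverts the map with equation (10.40), and checks that the vector of equation (10.44) is null in the signature (+, +, −). The function names are hypothetical and the component ordering ($e_1$, $e_2$, $\bar{e}$) is a choice made for the example.

```python
from fractions import Fraction

def circle_point(x):
    """(cos(theta), sin(theta)) for the point x, from equation (10.41)."""
    x = Fraction(x)
    return (2 * x / (1 + x * x), (1 - x * x) / (1 + x * x))

def line_point(c, s):
    """Inverse map x = cos(theta) / (1 + sin(theta)), equation (10.40)."""
    return c / (1 + s)

def conformal(x):
    """Components of X on (e1, e2, ebar), equation (10.44)."""
    x = Fraction(x)
    return (2 * x, 1 - x * x, 1 + x * x)

def square(X):
    """X^2 with e1^2 = e2^2 = +1 and ebar^2 = -1."""
    return X[0] ** 2 + X[1] ** 2 - X[2] ** 2

for x in (0, 1, -3, Fraction(2, 5)):
    c, s = circle_point(x)
    print(x, (c, s), line_point(c, s), square(conformal(x)))
```

The origin $x = 0$ maps to the circle point with $\sin(\theta) = 1$, while the point at infinity corresponds to $\sin(\theta) = -1$, where `line_point` is undefined.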


10.2.2 Conformal model of Euclidean space 


The form of equation (10.48) generalises easily. If x is an element of V(p, q), we 
set 


$$ F(x) = X = x^2 n + 2x - \bar{n}, \tag{10.49} $$
Figure 10.6 The conformal model of a line. Points on the line are represented
by null vectors in three dimensions. These lie on a cone, and the
intersection of the cone with a plane recovers the point. 


which is a null vector in V(p + 1,q + 1). This vector can be obtained simply via 
the map, 


$$ F(x) = -(x - e)\, n\, (x - e), \tag{10.50} $$


which is a reflection of the null vector n in the plane perpendicular to (x — e). 
The result must therefore be a new null vector. The presence of the vector e 
removes any ambiguity in handling the origin x = 0. The map F(x) is non-linear 
so, as with projective geometry, we move to a non-linear representation of points 
in conformal geometry. 

More generally, any null vector in V(p + 1,q + 1) can be written as 


$$ X = \lambda (x^2 n + 2x - \bar{n}), \tag{10.51} $$


with $\lambda$ a scalar. This provides a projective map between V(p + 1, q + 1) and
V(p, q). The family of null vectors, $\lambda(x^2 n + 2x - \bar{n})$, in V(p + 1, q + 1) correspond
to the single point $x \in$ V(p, q). Given an arbitrary null vector $X$, it is frequently
useful to convert it to the standard form of equation (10.49). This is achieved
by setting

$$ X \mapsto -2\, \frac{X}{X \cdot n}. \tag{10.52} $$

This map is similar to that employed in constructing a standard embedding in 
projective geometry. The status of the vector n is clear here — it represents the 
point at infinity. 
Given two null vectors $X$ and $Y$, in standard form, their inner product is

$$ \begin{aligned} X \cdot Y &= (x^2 n + 2x - \bar{n}) \cdot (y^2 n + 2y - \bar{n}) \\ &= -2x^2 - 2y^2 + 4\, x \cdot y \\ &= -2 (x - y)^2. \end{aligned} \tag{10.53} $$


This result is of fundamental importance to the conformal model of Euclidean 
geometry. The inner product in conformal space encodes the distance between 
points in Euclidean space. It follows that any transformation of null vectors 
in V(p + 1,q + 1) which leaves inner products invariant can correspond to a 
transformation in V(p,q) which leaves angles and distances invariant. In the 
next section we discuss these transformations in detail. 
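
The distance formula (10.53) and the normalisation (10.52) can be checked in components. In the sketch below a conformal vector is stored as the components of $x$ followed by the coefficients of $e$ and $\bar{e}$; since $n = e + \bar{e}$ and $\bar{n} = e - \bar{e}$, the point $F(x)$ has $e$ coefficient $x^2 - 1$ and $\bar{e}$ coefficient $x^2 + 1$. A Euclidean base space is assumed, and the function names are hypothetical.

```python
from fractions import Fraction

def F(x):
    """Conformal image x^2 n + 2x - nbar, stored as (2x..., e coeff, ebar coeff)."""
    x2 = sum(c * c for c in x)
    return tuple(2 * c for c in x) + (x2 - 1, x2 + 1)

def inner(X, Y):
    """Inner product with signature (+, ..., +, +, -): only ebar squares to -1."""
    *xs, xe, xb = X
    *ys, ye, yb = Y
    return sum(a * b for a, b in zip(xs, ys)) + xe * ye - xb * yb

def standard_form(X):
    """Rescale a null vector via X -> -2 X / (X . n), equation (10.52)."""
    n = (0,) * (len(X) - 2) + (1, 1)
    s = Fraction(-2) / inner(X, n)
    return tuple(s * c for c in X)

x, y = (1, 2, 3), (4, 6, 3)                  # Euclidean distance between them is 5
print(inner(F(x), F(x)))                     # null vector
print(inner(F(x), F(y)))                     # -2 * (distance squared)
print(standard_form(tuple(7 * c for c in F(x))) == F(x))
```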


10.3 Conformal transformations 


The study of the main geometric primitives in conformal geometry is simplified
by first understanding the nature of the conformal group. For points $x$, $y$ in
V(p, q) the definition of a conformal transformation is that it leaves angles invariant.
So, if $f$ is a map from V(p, q) to itself, then $f$ is a conformal transformation
if

$$ \mathsf{f}(a) \cdot \mathsf{f}(b) = \lambda\, a \cdot b, \qquad \forall\, a, b \in V(p, q), \tag{10.54} $$

where

$$ \mathsf{f}(a) = a \cdot \nabla f(x). \tag{10.55} $$


While $\mathsf{f}(a)$ is a linear map at each point $x$, the conformal transformation $f(x)$
is not restricted to being linear. Conformal transformations form a group, the 
conformal group, the main elements of which are translations, rotations, dilations 
and inversions. We now study each of these in turn. 


10.3.1 Translations 


To begin, consider the fundamental operation of translation in the space V(p, q). 
This is not a linear operation in V(p,q), but does become linear in the pro- 
jective framework. In the conformal model we achieve a further refinement, as 
translations can now be handled by rotors. Consider the rotor 


$$ R = T_a = e^{na/2}, \tag{10.56} $$


where $a \in$ V(p, q), so that $a \cdot n = 0$. The generator for the rotor is a null bivector,
so the Taylor series for $T_a$ terminates after two terms:

$$ T_a = 1 + \frac{na}{2}. \tag{10.57} $$
The rotor $T_a$ transforms the null vectors $n$ and $\bar{n}$ into

$$ T_a n \tilde{T}_a = n + \tfrac{1}{2} nan + \tfrac{1}{2} nan + \tfrac{1}{4} nanan = n \tag{10.58} $$

and

$$ T_a \bar{n} \tilde{T}_a = \bar{n} - 2a - a^2 n. \tag{10.59} $$

Acting on a vector $x \in$ V(p, q) we similarly obtain

$$ T_a x \tilde{T}_a = x + n\,(a \cdot x). \tag{10.60} $$

Combining these we find that

$$ \begin{aligned} T_a F(x) \tilde{T}_a &= x^2 n + 2(x + a \cdot x\, n) - (\bar{n} - 2a - a^2 n) \\ &= (x + a)^2 n + 2(x + a) - \bar{n} \\ &= F(x + a), \end{aligned} \tag{10.61} $$


which performs the conformal version of the translation $x \mapsto x + a$. Translations
are handled as rotations in conformal space, and the rotor group provides a 
double-cover representation of a translation. The identity 


$$ \tilde{T}_a = T_{-a} \tag{10.62} $$
ensures that the inverse transformation in conformal space corresponds to a 
translation in the opposite direction, as required. 
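
The rotor form of a translation can be tested directly with a small geometric-product routine. The sketch below is a minimal implementation written for this illustration, not an established library: multivectors are dictionaries keyed by basis-blade bitmasks, the basis is ($e_1$, $e_2$, $e$, $\bar{e}$) for the conformal model of the Euclidean plane, and all function names are assumptions. It builds $T_a = 1 + na/2$ and checks that $T_a F(x) \tilde{T}_a = F(x + a)$.

```python
from fractions import Fraction

# Basis vectors e1, e2, e, ebar are the bits 1, 2, 4, 8; METRIC holds their squares.
METRIC = (1, 1, 1, -1)

def blade_product(a, b):
    """Product of two basis blades (bitmasks): returns (sign, blade)."""
    sign, t = 1, a >> 1
    while t:                                  # parity of swaps needed to sort factors
        if bin(t & b).count("1") % 2:
            sign = -sign
        t >>= 1
    for i, m in enumerate(METRIC):            # repeated basis vectors square to METRIC
        if (a & b) >> i & 1:
            sign *= m
    return sign, a ^ b

def gp(A, B):
    """Geometric product of multivectors stored as {blade: coefficient}."""
    out = {}
    for a, x in A.items():
        for b, y in B.items():
            s, c = blade_product(a, b)
            out[c] = out.get(c, 0) + s * x * y
    return {k: v for k, v in out.items() if v != 0}

def reverse(A):
    """Reverse: a grade-g blade picks up the sign (-1)^(g(g-1)/2)."""
    out = {}
    for k, v in A.items():
        g = bin(k).count("1")
        out[k] = -v if (g * (g - 1) // 2) % 2 else v
    return out

def vector(c1, c2, ce=0, cb=0):
    """Vector with components on e1, e2, e, ebar (zero entries are dropped)."""
    comps = {1: c1, 2: c2, 4: ce, 8: cb}
    return {k: Fraction(v) for k, v in comps.items() if v != 0}

N = vector(0, 0, 1, 1)                        # n = e + ebar

def F(x):
    """Conformal point x^2 n + 2x - nbar for x = (x1, x2)."""
    x2 = x[0] ** 2 + x[1] ** 2
    return vector(2 * x[0], 2 * x[1], x2 - 1, x2 + 1)

def translator(a):
    """T_a = 1 + n a / 2, equation (10.57)."""
    T = {k: v / 2 for k, v in gp(N, vector(*a)).items()}
    T[0] = T.get(0, 0) + 1
    return T

def translate(X, a):
    """Sandwich product T_a X reverse(T_a)."""
    T = translator(a)
    return gp(gp(T, X), reverse(T))

x, a = (3, -1), (2, 5)
print(translate(F(x), a) == F((5, 4)))        # F(x + a)
print(translate(N, a) == N)                   # n is left unchanged
```

Because the arithmetic is exact, all bivector and trivector terms produced in the intermediate product cancel identically and the result is a pure vector.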


10.3.2 Rotations 


Next, suppose that we rotate the vector x about the origin in V(p,q). This is 
achieved with the rotor $R \in$ G(p, q) via the familiar transformation $x \mapsto x' = R x \tilde{R}$.
The image of the transformed point is

$$ \begin{aligned} F(x') &= x^2 n + 2 R x \tilde{R} - \bar{n} \\ &= R\,(x^2 n + 2x - \bar{n})\,\tilde{R} = R F(x) \tilde{R}. \end{aligned} \tag{10.63} $$

This holds because $R$ is an even element in G(p, q), so must commute with both
$n$ and $\bar{n}$. Rotations about the origin therefore take the same form in either space.
Suppose instead that we wish to rotate about the point a € V(p, q). This can 


be achieved by translating a to the origin, rotating and then translating forward 
again. In terms of X = F(x) the result is 


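$$ X \mapsto T_a R T_{-a}\, X\, \tilde{T}_{-a} \tilde{R} \tilde{T}_a = R' X \tilde{R}'. \tag{10.64} $$

The rotation is now controlled by the rotor

$$ R' = T_a R \tilde{T}_a = \left(1 + \frac{na}{2}\right) R \left(1 + \frac{an}{2}\right). \tag{10.65} $$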


So, as expected, the conformal model has freed us from treating the origin as a 
special point. Rotations about any point are handled in the same manner, and
are still generated by a bivector blade. Similar observations hold for reflections, 
but we delay a full treatment of these until we have described how lines and 
surfaces are handled in the conformal model. The preceding formulae for trans- 
lations and rotations form the basis of the subject of screw theory, which has its 
origins in the nineteenth century. 


10.3.3 Inversions 


Rotations and translations are elements of the Euclidean group, as they leave 
distances between points invariant. This is a subgroup of the larger conformal 
group, which only leaves angles invariant. The conformal group essentially con- 
tains two further transformations: inversions and dilations. An inversion in the 


origin consists of the map 


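$$ x \mapsto x' = \frac{x}{x^2} = x^{-1}. \tag{10.66} $$

The conformal vector corresponding to the inverted point is

$$ F(x') = x^{-2} n + 2x^{-1} - \bar{n} = \frac{1}{x^2} \left( n + 2x - x^2 \bar{n} \right). \tag{10.67} $$

But in conformal space points are represented homogeneously, so the pre-factor
of $x^{-2}$ can be ignored. In conformal space an inversion in the origin consists
solely of the map

$$ n \mapsto -\bar{n}, \qquad \bar{n} \mapsto -n. \tag{10.68} $$

This map is achieved by a reflection in $e$, since

$$ -e n e = -e e \bar{n} = -\bar{n}. \tag{10.69} $$

We can therefore write

$$ -e F(x) e = x^2 F(x^{-1}), \tag{10.70} $$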


which shows that inversions in V (p, q) are represented as reflections in the confor- 
mal space V(p+1,q+ 1). As both X and —X are homogeneous representations 
of the same point, it is irrelevant whether we take —e(...)e or e(...)e as the 
reflection. In the following we will use e(...)e for convenience. 
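
The identity $-eF(x)e = x^2 F(x^{-1})$ can be checked in coordinates. The sketch below is illustrative only: it assumes conformal vectors are stored as lists in the basis $(e_1, \ldots, e_k, e, \bar{e})$ with $n = e + \bar{e}$ and $\bar{n} = e - \bar{e}$, and the helper names `F`, `reflect_in_e` and `invert` are hypothetical. Since $e^2 = 1$, the reflection $-eXe = X - 2(e \cdot X)e$ simply flips the sign of the $e$ component.

```python
from fractions import Fraction

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def reflect_in_e(X):
    """Return -e X e = X - 2 (e.X) e, which flips the sign of the e component."""
    Y = list(X)
    Y[-2] = -Y[-2]
    return Y

def invert(x):
    """Euclidean inversion in the origin, x -> x / x^2."""
    x2 = sum(c * c for c in x)
    return [c / x2 for c in x]

x = [Fraction(3), Fraction(-1), Fraction(2)]
x2 = sum(c * c for c in x)
lhs = reflect_in_e(F(x))                 # -e F(x) e
rhs = [x2 * c for c in F(invert(x))]     # x^2 F(x^{-1})
print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than a tolerance.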

A reflection in e corresponds to an inversion in the origin in Euclidean space. 
To find the generator of an inversion in an arbitrary point a, we translate to the 
origin, invert and translate forward again. The resulting generator is then 


$$T_a e \tilde{T}_a = \left(1 + \frac{na}{2}\right) e \left(1 + \frac{an}{2}\right) = e - a - \frac{a^2}{2} n. \tag{10.71}$$

Now, recalling that $e = (n + \bar{n})/2$, the generating vector can also be written as

$$T_a e \tilde{T}_a = \tfrac{1}{2}(n - F(a)) = \tfrac{1}{2}(n - A). \tag{10.72}$$

A reflection in $(n - F(a))$ therefore achieves an inversion about the point $a$ in
Euclidean space. As with translations, a nonlinear transformation in Euclidean 
space has been linearised by moving to a conformal representation of points. The 
generator of an inversion is a vector with positive square. In section 10.5.1 we 
see how these vectors are related to circles and spheres. 


10.3.4 Dilations 
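A dilation in the origin is given by

$$x \mapsto x' = \mathrm{e}^{-\alpha} x, \tag{10.73}$$

where $\alpha$ is a scalar. Clearly, this transformation does not alter angles, so is
a conformal transformation. The null vector corresponding to the transformed
point is

$$F(x') = \mathrm{e}^{-\alpha}\left(\mathrm{e}^{-\alpha} x^2 n + 2x - \mathrm{e}^{\alpha} \bar{n}\right). \tag{10.74}$$

Clearly the map we need to achieve is

$$n \mapsto \mathrm{e}^{-\alpha} n, \qquad \bar{n} \mapsto \mathrm{e}^{\alpha} \bar{n}. \tag{10.75}$$

This transformation does not alter the inner product of $n$ and $\bar{n}$, so can be
represented with a rotor. As the vector $x$ is unchanged, the rotor can only be
generated by the timelike bivector $e\bar{e}$. If we set

$$N = e\bar{e} = \tfrac{1}{2}\bar{n} \wedge n \tag{10.76}$$

then $N$ satisfies

$$Nn = -n = -nN, \qquad N\bar{n} = \bar{n} = -\bar{n}N, \qquad N^2 = 1. \tag{10.77}$$

We now introduce the rotor

$$D_\alpha = \mathrm{e}^{\alpha N/2} = \cosh(\alpha/2) + \sinh(\alpha/2)\, N. \tag{10.78}$$

This rotor satisfies

$$D_\alpha n \tilde{D}_\alpha = \mathrm{e}^{-\alpha} n, \qquad D_\alpha \bar{n} \tilde{D}_\alpha = \mathrm{e}^{\alpha} \bar{n}, \tag{10.79}$$

and so carries out the required transformation. We can therefore write

$$F(\mathrm{e}^{-\alpha} x) = \mathrm{e}^{-\alpha} D_\alpha F(x) \tilde{D}_\alpha, \tag{10.80}$$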


which confirms that a dilation in the origin is represented by a simple rotor in 
conformal space. To achieve a dilation about an arbitrary point a we form 


$$D'_\alpha = T_a D_\alpha \tilde{T}_a = \mathrm{e}^{\alpha N'/2}, \tag{10.81}$$

where the generator is now

$$N' = T_a N \tilde{T}_a = \tfrac{1}{2} T_a\, \bar{n} \wedge n\, \tilde{T}_a = -\tfrac{1}{2} A \wedge n, \tag{10.82}$$

with $A = F(a)$. A dilation about $a$ is therefore generated by

$$D'_\alpha = \exp(-\alpha A \wedge n/4) = \exp\left(\frac{\alpha}{2} \frac{A \wedge n}{A \cdot n}\right). \tag{10.83}$$

The generator is governed by two null vectors, one for the point about which the
dilation is performed and one for the point at infinity.
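
As a rough numerical illustration of a dilation in the origin, the rotor action $X \mapsto D_\alpha X \tilde{D}_\alpha$ can be written directly on coordinates, because it rescales the $n$ and $\bar{n}$ coefficients and leaves the Euclidean part alone. The sketch below assumes vectors are stored in the basis $(e_1, \ldots, e_k, e, \bar{e})$ with $n = e + \bar{e}$, $\bar{n} = e - \bar{e}$; the function names are hypothetical and no rotor is formed explicitly.

```python
import math

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2.0 * c for c in x] + [x2 - 1.0, x2 + 1.0]

def dilate(X, alpha):
    """Apply X -> D X D~ using n -> exp(-alpha) n and nbar -> exp(alpha) nbar."""
    *xs, ce, cb = X
    p = (ce + cb) / 2.0          # coefficient of n
    q = (ce - cb) / 2.0          # coefficient of nbar
    p *= math.exp(-alpha)
    q *= math.exp(alpha)
    return xs + [p + q, p - q]   # back to e and ebar components

alpha = 0.7
x = [1.5, -2.0, 0.25]
lhs = F([math.exp(-alpha) * c for c in x])                   # F(exp(-alpha) x)
rhs = [math.exp(-alpha) * c for c in dilate(F(x), alpha)]    # exp(-alpha) D F(x) D~
print(all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(lhs, rhs)))
```

In the $(e, \bar{e})$ components the map is a hyperbolic rotation (a boost), which is the coordinate counterpart of the generator $N$ squaring to $+1$.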


10.3.5 Special conformal transformations 


A special conformal transformation consists of an inversion in the origin, a trans- 
lation and a further inversion in the origin. We can therefore handle these in 
terms of the representations we have already established. In Euclidean space the 
effect of a special conformal transformation can be written as

$$x \mapsto \frac{x + a x^2}{1 + 2a \cdot x + a^2 x^2} = x \frac{1}{1 + ax} = \frac{1}{1 + xa}\, x. \tag{10.84}$$

The final expressions confirm that a special conformal transformation corresponds
to a position-dependent rotation and dilation in Euclidean space, so does
leave angles unchanged. To construct the equivalent rotor in $G(p+1,q+1)$ we
form

$$K_a = e T_a e = 1 - \frac{\bar{n} a}{2}, \tag{10.85}$$

which ensures that $K_a F(x) \tilde{K}_a$ is a special conformal transformation. Explicitly,
we have

$$F\left(x \frac{1}{1 + ax}\right) = (1 + 2a \cdot x + a^2 x^2)^{-1} K_a F(x) \tilde{K}_a, \tag{10.86}$$

and again we can ignore the pre-factor and use $K_a F(x) \tilde{K}_a$ as the homogeneous
representation of the result of a special conformal transformation.
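
The Euclidean form of the special conformal transformation can be checked against its definition as an inversion, a translation by $a$ and a second inversion. This is a minimal sketch with hypothetical helper names, using exact rationals; it works on Euclidean coordinates only and does not build the rotor $K_a$.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def invert(x):
    """Inversion in the origin, x -> x / x^2."""
    s = dot(x, x)
    return [c / s for c in x]

def special_conformal(x, a):
    """Closed form (x + a x^2) / (1 + 2 a.x + a^2 x^2)."""
    x2, a2, ax = dot(x, x), dot(a, a), dot(a, x)
    denom = 1 + 2 * ax + a2 * x2
    return [(xi + ai * x2) / denom for xi, ai in zip(x, a)]

def invert_translate_invert(x, a):
    """The same map built from its three elementary steps."""
    y = invert(x)
    y = [yi + ai for yi, ai in zip(y, a)]
    return invert(y)

x = [Fraction(1), Fraction(2)]
a = [Fraction(3), Fraction(-1)]
print(special_conformal(x, a) == invert_translate_invert(x, a))
```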


10.3.6 Euclidean transformations 


The group of Euclidean transformations is a subgroup of the full conformal 
group. The additional restriction is that lengths as well as angles are invariant. 
Equation (10.53) showed that the inner product of two null vectors is related 
to the Euclidean distance between the corresponding points. To establish a 
homogeneous formula, we must write

$$|a - b|^2 = -2 \frac{A \cdot B}{A \cdot n \, B \cdot n}, \tag{10.87}$$

which is homogeneous in $A$ and $B$. The Euclidean group can now be seen to be
the subgroup of the conformal group which leaves n invariant. This is sensible, 
as the point at infinity should stay there under a Euclidean transformation. 
The Euclidean group is thus the stability group of a null vector in conformal 
space. The group of generators of reflections and rotations in conformal space 
which leave n invariant then provide a double cover of the Euclidean group. 
Equation (10.87) returns the Euclidean distance between points. If the vector 
$n$ is replaced by $e$ or $\bar{e}$ we can transform to distance measures in hyperbolic or
spherical geometry. This makes it a simple exercise to attach different geometric 
pictures to algebraic results in conformal space. 
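
The homogeneity of the distance formula is easy to see numerically: rescaling either null vector leaves the result unchanged. The sketch below assumes the coordinate convention $(e_1, \ldots, e_k, e, \bar{e})$ with signature $(+, \ldots, +, +, -)$ and $n = e + \bar{e}$; `F`, `dot` and `distance_sq` are illustrative names.

```python
from fractions import Fraction

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def distance_sq(A, B, n):
    """Squared Euclidean distance -2 A.B / (A.n B.n), valid for any scaling of A and B."""
    return Fraction(-2) * dot(A, B) / (dot(A, n) * dot(B, n))

n = [0, 0, 0, 1, 1]
A, B = F([1, 2, 3]), F([4, 6, 3])
print(distance_sq(A, B, n))                                       # standard form
print(distance_sq([7 * c for c in A], [-3 * c for c in B], n))    # rescaled representatives
```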


10.4 Geometric primitives in conformal space 


Now that we have seen how points are encoded in conformal space, we can 
begin to build up more complex geometric objects. As in projective geometry, 
we expect that a multivector blade L will encode a geometric object via the 
equation 


$$L \wedge X = 0, \qquad X^2 = 0. \tag{10.88}$$

The question, then, is what type of object does each grade of multivector return.
One important result we can exploit is that $X^2 = 0$ is unchanged if $X \mapsto RX\tilde{R}$.
So, if a geometric object is specified by $L$ via equation (10.88), it follows that

$$R(L \wedge X)\tilde{R} = (RL\tilde{R}) \wedge (RX\tilde{R}) = 0. \tag{10.89}$$


We can therefore transform the object L with a general element of the conformal 
group to obtain a new object. Similar considerations hold for incidence relations. 
Since conformal transformations only preserve angles, and do not necessarily map 
straight lines to straight lines, the range of objects we can describe by simple 
blades is clearly going to be larger than in projective geometry. 


10.4.1 Bivectors and points 


A pair of points in Euclidean space are represented by two null vectors in a space 
of two dimensions higher. We know that the inner product in this space returns 
information about distances. The next question to ask is what is the significance 
of the outer product of two vectors. If A and B are null vectors, we form the 
bivector 


$$G = A \wedge B. \tag{10.90}$$

The bivector $G$ has magnitude

$$G^2 = (AB - A \cdot B)(-BA + A \cdot B) = (A \cdot B)^2, \tag{10.91}$$

which shows that $G$ is timelike, borrowing the terminology of special relativity.
It follows that $G$ contains a pair of null vectors. If we look for solutions to the
equation

$$G \wedge X = 0, \qquad X^2 = 0, \tag{10.92}$$

the only solutions are the two null vectors contained in $G$. These are precisely
$A$ and $B$, so the bivector encodes the two points directly. In the conformal
model, no information is lost in forming the exterior product of two null vectors.
Spacelike bivectors, with $B^2 < 0$, do not contain any null vectors, so in this case
there are no solutions to $B \wedge X = 0$ with $X^2 = 0$. The critical case of $B^2 = 0$
implies that $B$ contains a single null vector.

Given a timelike bivector, $B^2 > 0$, we require an efficient means of finding the
two null vectors in the plane. This can be achieved without solving any quadratic
equations as follows. Pick an arbitrary vector $a$, with a partial projection in the
plane, $a \cdot B \neq 0$. If the underlying space is Euclidean, one can use the vector $\bar{e}$,
since all timelike bivectors contain a factor of this. Now remove the component
of $a$ outside the plane by defining

$$a' = a - a \wedge \hat{B}\, \hat{B}, \tag{10.93}$$

where $\hat{B} = B/|B|$ is normalised so that $\hat{B}^2 = 1$. If $a'$ is already null then it
defines one of the required vectors. If not, then one can form two null vectors in
the $B$ plane by writing

$$A_\pm = a' \pm a' \hat{B}. \tag{10.94}$$

One can easily confirm that $A_\pm$ are both null vectors, and so return the desired
points.
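
The procedure can be sketched without a full geometric algebra library when the bivector is given as $B = u \wedge v$. The sketch assumes the standard identities $w \cdot (u \wedge v) = (w \cdot u) v - (w \cdot v) u$ and $(u \wedge v)^2 = (u \cdot v)^2 - u^2 v^2$, so that $a' = (a \cdot B) \cdot B / B^2$ and $a' \hat{B} = a' \cdot B / |B|$. All function names are hypothetical, and the case where $a'$ is already null is not handled.

```python
import math

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2.0 * c for c in x] + [x2 - 1.0, x2 + 1.0]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def null_vectors(u, v, a):
    """Return the two null vectors A+ and A- in the plane of B = u ^ v (needs B^2 > 0)."""
    uv, uu, vv = dot(u, v), dot(u, u), dot(v, v)
    B2 = uv * uv - uu * vv                      # (u ^ v)^2
    if B2 <= 0:
        raise ValueError("bivector is not timelike")

    def contract(w):                            # w . (u ^ v)
        wu, wv = dot(w, u), dot(w, v)
        return [wu * vi - wv * ui for ui, vi in zip(u, v)]

    a_par = [c / B2 for c in contract(contract(a))]        # a' = (a.B).B / B^2
    t = [c / math.sqrt(B2) for c in contract(a_par)]       # a' Bhat
    plus = [p + q for p, q in zip(a_par, t)]
    minus = [p - q for p, q in zip(a_par, t)]
    return plus, minus

def to_point(X):
    """Euclidean point encoded by a null vector X of arbitrary scale."""
    n = [0.0] * (len(X) - 2) + [1.0, 1.0]
    Xn = dot(X, n)
    return [-c / Xn for c in X[:-2]]

P, Q = F([0.0, 0.0]), F([1.0, 0.0])
u = [p + q for p, q in zip(P, Q)]            # two non-null vectors spanning the P ^ Q plane
v = [p - 2.0 * q for p, q in zip(P, Q)]
ebar = [0.0, 0.0, 0.0, 1.0]
print([to_point(X) for X in null_vectors(u, v, ebar)])
```

Which of the two points comes out as $A_+$ depends on the orientation of $B$, so callers should not rely on the order.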


10.4.2 Trivectors, lines and circles 


If a bivector now only represents a pair of points, the obvious question is how 
do we describe a line? Suppose we construct the line through the points a and 
b in V(p,q). A point on the line is given by 


$$x = \lambda a + (1 - \lambda) b. \tag{10.95}$$

The conformal version of this line is

$$\begin{aligned} F(x) &= \left(\lambda^2 a^2 + 2\lambda(1 - \lambda)\, a \cdot b + (1 - \lambda)^2 b^2\right) n + 2\lambda a + 2(1 - \lambda) b - \bar{n} \\ &= \lambda A + (1 - \lambda) B + \tfrac{1}{2}\lambda(1 - \lambda)\, A \cdot B \, n, \end{aligned} \tag{10.96}$$

and any multiple of this encodes the same point on the line. It is clear, then,
that a conformal point $X$ is a linear combination of $A$, $B$ and $n$, subject to the
constraint that $X^2 = 0$. This is summarised by

$$(A \wedge B \wedge n) \wedge X = 0, \qquad X^2 = 0. \tag{10.97}$$

So it is trivectors that represent lines in conformal geometry. This illustrates a
general feature of the conformal model: geometric objects are represented by
multivectors of one grade higher than their projective counterpart. The extra
degree of freedom is absorbed by the constraint that $X^2 = 0$.
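
The decomposition of a point on the line into $A$, $B$ and $n$ can be verified with exact arithmetic. In this sketch the coordinate layout $(e_1, \ldots, e_k, e, \bar{e})$ and the names `F`, `dot` and `line_point` are assumptions made for illustration.

```python
from fractions import Fraction

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def line_point(a, b, lam):
    """lam A + (1 - lam) B + (1/2) lam (1 - lam) (A.B) n for A = F(a), B = F(b)."""
    A, B = F(a), F(b)
    n = [0] * len(a) + [1, 1]
    coeff = Fraction(1, 2) * lam * (1 - lam) * dot(A, B)
    return [lam * Ai + (1 - lam) * Bi + coeff * ni for Ai, Bi, ni in zip(A, B, n)]

a = [Fraction(1), Fraction(2)]
b = [Fraction(-3), Fraction(5)]
lam = Fraction(2, 7)
x = [lam * ai + (1 - lam) * bi for ai, bi in zip(a, b)]
print(F(x) == line_point(a, b, lam))
```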

As stated above, if we apply a conformal transformation to a trivector repre- 
senting a line, we must obtain a new line. But there is no reason to expect this to 
be straight. To see what else can result, consider a simple inversion in the origin. 
Suppose that $(x_1, x_2)$ denote a pair of Cartesian coordinates for the Euclidean
plane, and consider the line $x_1 = 1$. Points on the line have components $(1, x_2)$,
with $-\infty < x_2 < +\infty$. The image of this line under an inversion in the origin
has coordinates $(x_1', x_2')$, where

$$x_1' = \frac{1}{1 + x_2^2}, \qquad x_2' = \frac{x_2}{1 + x_2^2}. \tag{10.98}$$

It is now straightforward to show that

$$\left(x_1' - \tfrac{1}{2}\right)^2 + (x_2')^2 = \left(\tfrac{1}{2}\right)^2. \tag{10.99}$$

Hence inversion of a line produces a circle, centred on $(1/2, 0)$ and with radius
$1/2$.
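
A quick check of this statement, written as a small sketch with hypothetical helper names: sample points on the line $x_1 = 1$, invert them in the origin, and confirm that every image lies on the circle of radius $1/2$ centred on $(1/2, 0)$.

```python
from fractions import Fraction

def invert(x):
    """Inversion in the origin, x -> x / x^2."""
    s = sum(c * c for c in x)
    return [c / s for c in x]

def on_image_circle(p):
    """True if p lies on the circle centred on (1/2, 0) with radius 1/2."""
    return (p[0] - Fraction(1, 2)) ** 2 + p[1] ** 2 == Fraction(1, 4)

# Sample the line x1 = 1 at x2 = -3, -8/3, ..., 3 and invert each point.
images = [invert([Fraction(1), Fraction(t, 3)]) for t in range(-9, 10)]
print(all(on_image_circle(p) for p in images))
```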
It follows that a general trivector in conformal space can encode a circle, with 
a line representing the special case of infinite radius. This is entirely sensible, as 
three distinct points are required to specify a circle. The points define a plane, 


and any three non-collinear points in a plane specify a unique circle. So, given
three points $A_1$, $A_2$, $A_3$, the circle through all three is defined by

$$A_1 \wedge A_2 \wedge A_3 \wedge X = 0, \tag{10.100}$$

together with the restriction (often unstated) that $X^2 = 0$. The trivector

$$L = A_1 \wedge A_2 \wedge A_3 \tag{10.101}$$

therefore encodes a unique circle in conformal geometry. The test that the points
lie on a straight line is that the circle passes through the point at infinity,

$$L \wedge n = 0 \quad \Rightarrow \quad \text{straight line}. \tag{10.102}$$


This explains why our earlier derivation of the line through $A_1$ and $A_2$ led to
the trivector $A_1 \wedge A_2 \wedge n$, which explicitly includes the point at infinity. Unlike
tests for linear dependence, testing for zero in equation (10.102) is numerically
acceptable. The reason is that the magnitude of $L \wedge n$ controls the deviation from
straightness. If precision is limited, one can then define how close $L \wedge n$ should
be to zero in order for the line to be treated as straight. This is quite different to
linear independence, where the concept of ‘nearly independent’ makes no sense.
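
For the Euclidean plane the conformal space is four-dimensional, so the outer product of four vectors is the determinant of their coordinates times the pseudoscalar. Under that observation the incidence and straightness tests reduce to $4 \times 4$ determinants. The sketch below is illustrative: `det`, `on_circle` and `is_straight` are hypothetical names, and the tolerance is an arbitrary choice.

```python
def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x1, 2x2, e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def det(M):
    """Determinant by Laplace expansion along the first row (adequate for 4 x 4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

N_INF = [0, 0, 1, 1]   # the point at infinity n = e + ebar

def on_circle(a1, a2, a3, x, tol=0):
    """Test A1 ^ A2 ^ A3 ^ X = 0 for the Euclidean points a1, a2, a3, x."""
    return abs(det([F(a1), F(a2), F(a3), F(x)])) <= tol

def is_straight(a1, a2, a3, tol=0):
    """Test L ^ n = 0, i.e. whether the circle through the points passes through infinity."""
    return abs(det([F(a1), F(a2), F(a3), N_INF])) <= tol

print(on_circle([1, 0], [0, 1], [-1, 0], [0, -1]))           # fourth point of the unit circle
print(is_straight([0, 0], [1, 1], [2, 2]))                    # exactly collinear
print(is_straight([0, 0], [1, 1], [2, 2 + 1e-9], tol=1e-6))   # straight to within tolerance
```

The last call illustrates the point made in the text: the size of the determinant measures the deviation from straightness, so a tolerance is meaningful.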

Figure 10.7 The unit circle. Three reference points are marked on the
circle.

Given that a trivector $L$ encodes a circle, we should expect to be able to extract
the key geometric properties of the circle directly from $L$. In particular, we seek
expressions for the centre and radius of the circle. (The plane containing the
circle is specified by the 4-vector $L \wedge n$, as we explain in the following section.)
Any circle in a plane can be mapped onto any other by a translation and a 
dilation. Under the latter we find that

$$L \wedge n \mapsto (D_\alpha L \tilde{D}_\alpha) \wedge n = \mathrm{e}^{\alpha} D_\alpha (L \wedge n) \tilde{D}_\alpha. \tag{10.103}$$

It follows that $(L \wedge n)^2$ scales as the inverse square of the radius. Next, consider
the unit circle in the $xy$ plane, and take as three points on the circle
those shown in figure 10.7. The trivector for this circle is

$$L_0 = F(e_1) \wedge F(e_2) \wedge F(-e_1) = 16 e_1 e_2 \bar{e}. \tag{10.104}$$

It follows that

$$\frac{L_0^2}{(L_0 \wedge n)^2} = -1, \tag{10.105}$$

which is (minus) the square of the radius of the unit circle. We can translate
and dilate this into any circle we choose, so the radius $\rho$ of the circle encoded by
the trivector $L$ is given by

$$\rho^2 = -\frac{L^2}{(L \wedge n)^2}. \tag{10.106}$$

This is a further illustration of how metric information is carried around in the
homogeneous framework of the conformal model. If $L$ represents a straight line
we know that $L \wedge n = 0$, so the radius we obtain is infinite.
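
The radius formula can be evaluated from inner products alone. The sketch assumes the standard result that the square of a blade $a_1 \wedge \cdots \wedge a_r$ equals $(-1)^{r(r-1)/2}$ times the determinant of the Gram matrix of its factors, so that $L^2 = -\det G(A_1, A_2, A_3)$ and $(L \wedge n)^2 = +\det G(A_1, A_2, A_3, n)$. The helper names are hypothetical.

```python
from fractions import Fraction

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x1, 2x2, e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * e * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j, e in enumerate(M[0]))

def gram(vectors):
    """Matrix of mutual inner products."""
    return [[dot(u, v) for v in vectors] for u in vectors]

def radius_sq(a1, a2, a3):
    """rho^2 = -L^2 / (L ^ n)^2 for L = F(a1) ^ F(a2) ^ F(a3); None for a straight line."""
    pts = [F(a1), F(a2), F(a3)]
    n = [0, 0, 1, 1]
    L2 = -det(gram(pts))            # square of the trivector L
    Ln2 = det(gram(pts + [n]))      # square of the 4-vector L ^ n
    if Ln2 == 0:
        return None                 # L ^ n = 0: infinite radius
    return Fraction(-L2, Ln2)

print(radius_sq([1, 0], [0, 1], [-1, 0]))     # unit circle
print(radius_sq([7, 3], [2, 8], [-3, 3]))     # centre (2, 3), radius 5
```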

Similar reasoning produces a formula for the centre of a circle. Essentially the 
only objects we have to work with are L and n. If we form LnL for the case of 
the unit circle we obtain

$$L_0 n L_0 \propto e_1 e_2 \bar{e}\, n\, e_1 e_2 \bar{e} = -\bar{n}. \tag{10.107}$$

But $\bar{n}$ is the null vector for the origin, so this expression has returned the desired
point. Again, we can translate and dilate this result to obtain an arbitrary circle,
and we find in general that the centre $C$ of the circle $L$ is obtained by

$$C = LnL. \tag{10.108}$$


We will see in section 10.5.5 that the operation L... L generates a reflection in 
a circle. Equation (10.108) then says that the centre of a circle is the image of 
the point at infinity under a reflection in the circle. 


10.4.3 4-vectors, spheres and planes 


We can apply the same reasoning for lines and circles to the case of planes and 
spheres and, for mixed signature spaces, hyperboloids. Suppose initially that the 
points a,b,c define a plane in V(p, q), so that an arbitrary point in the plane is 
given by 


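$$x = \alpha a + \beta b + \gamma c, \qquad \alpha + \beta + \gamma = 1. \tag{10.109}$$

The conformal representation of $x$ is

$$X = \alpha A + \beta B + \gamma C + \delta n, \tag{10.110}$$

where $A = F(a)$ etc., and

$$\delta = \tfrac{1}{2}(\alpha\beta\, A \cdot B + \alpha\gamma\, A \cdot C + \beta\gamma\, B \cdot C). \tag{10.111}$$

Varying $\alpha$ and $\beta$, together with the freedom to scale $F(x)$, now produces general
null combinations of the vectors $A$, $B$, $C$ and $n$. The equation for the plane can
then be written

$$A \wedge B \wedge C \wedge n \wedge X = 0. \tag{10.112}$$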


The plane passes through the points defined by A, B, C and the point at infinity 
n. We can therefore see that a general plane in conformal space is defined by 
four points. 

If the four points in question do not lie on a (flat) plane, then the 4-vector 
formed from their outer product defines a sphere. To see this we again consider 
inversion in the origin, this time applied to the $x_1 = 1$ plane. A point on the
plane has coordinates $(1, x_2, x_3)$, and under an inversion this maps to the point
with coordinates

$$x_1' = \frac{1}{1 + x_2^2 + x_3^2}, \qquad x_2' = \frac{x_2}{1 + x_2^2 + x_3^2}, \qquad x_3' = \frac{x_3}{1 + x_2^2 + x_3^2}. \tag{10.113}$$

The new coordinates satisfy

$$\left(x_1' - \tfrac{1}{2}\right)^2 + (x_2')^2 + (x_3')^2 = \left(\tfrac{1}{2}\right)^2, \tag{10.114}$$

which is the equation of a sphere. Inversion thus interchanges planes and spheres.
In particular, the point at infinity $n$ is transformed to the origin $\bar{n}$ under inversion,
which is now one of the points on the sphere.

Given any four distinct points $A_1, \ldots, A_4$, not all on a line or circle, the equation
of the unique sphere through all four points is

$$A_1 \wedge A_2 \wedge A_3 \wedge A_4 \wedge X = P \wedge X = 0, \tag{10.115}$$

so the sphere is defined by the 4-vector $P = A_1 \wedge A_2 \wedge A_3 \wedge A_4$. The sphere is flat
(a plane) if it passes through the point at infinity, the test for which is

$$A_1 \wedge A_2 \wedge A_3 \wedge A_4 \wedge n = P \wedge n = 0. \tag{10.116}$$


The 4-vector P contains all of the relevant geometric information for a sphere. 
The radius of the sphere $\rho$ is given by

$$\rho^2 = \frac{P^2}{(P \wedge n)^2}, \tag{10.117}$$

as is easily confirmed for the case of the unit sphere, $P = e_1 e_2 e_3 \bar{e}$. Similarly, the
centre of the sphere $C = F(c)$ is given by

$$C = PnP. \tag{10.118}$$


These formulae are the obvious generalisations of the results derived for circles. 


10.5 Intersection and reflection in conformal space 


One of the most significant advantages of the conformal approach to Euclidean 
geometry is the ease with which it solves complicated intersection problems. So, 
for example, finding the circle of intersection of two spheres is now no more 
complicated than finding the line of intersection of two planes. In addition, the 
concept of reflection is generalised in conformal space to include reflection in a 
sphere. This provides a very compact means of encoding the key concepts of 
inversive geometry. 


10.5.1 Duality in conformal space 


The concept of duality is key to intersecting objects in projective space, and the 
same is true in conformal space. Suppose that we start with the Euclidean plane, 
modelled in $G(3,1)$. Duality in this algebra interchanges spacelike and timelike
bivectors. It also maps trivectors to vectors, and vice versa. A trivector encodes
a line, or circle, so the dual of the circle $C$ is a vector $c$, where

$$c = C^* = IC \tag{10.119}$$

and $I$ is the pseudoscalar for $G(3,1)$. The equation for the circle, $X \wedge C = 0$, can
now be written in dual form and reduces to

$$X \cdot c = -I(X \wedge C) = 0. \tag{10.120}$$

The radius of the circle is now given by

$$\rho^2 = \frac{c^2}{(c \cdot n)^2}, \tag{10.121}$$

as the vector dual to a circle has positive signature. This picture provides us
with an alternative view of the concept of a point as being a circle of zero radius.
Similar considerations hold for spheres in three-dimensional space. These are 
represented as 4-vectors in G(4,1), so their dual is a vector. We write 


$$s = S^* = IS, \tag{10.122}$$

where $I$ is the pseudoscalar, so that the equation of a sphere becomes

$$X \cdot s = I(X \wedge S) = 0. \tag{10.123}$$

The radius of the sphere is again given by

$$\rho^2 = \frac{s^2}{(s \cdot n)^2}, \tag{10.124}$$

so that points are spheres of zero radius. One can see that this is sensible by
considering an alternative equation for a sphere. Suppose we are interested in
the sphere with centre $C$ and radius $\rho$. The equation for this can be written

$$-2 \frac{X \cdot C}{X \cdot n \, C \cdot n} = \rho^2. \tag{10.125}$$

Rearranging, this equation becomes

$$X \cdot (2C + \rho^2\, C \cdot n \, n) = 0, \tag{10.126}$$

and if $C$ is in standard form, $C = F(c)$, we obtain

$$X \cdot (F(c) - \rho^2 n) = 0. \tag{10.127}$$

We can therefore identify $s = S^*$ with the vector $F(c) - \rho^2 n$, which neatly encodes
the centre and radius of the sphere in a single vector. Whether the 4-vector $S$
or its dual vector $s$ is most useful depends on whether the sphere is specified by
four points lying on it, or by its centre and radius. For a given sphere $s$ we can
now write

$$s = \lambda(2C + \rho^2\, C \cdot n \, n). \tag{10.128}$$
It is then straightforward to confirm that the radius is given by equation (10.124).
The centre of the sphere can be recovered from

$$\frac{C}{C \cdot n} = \frac{s}{s \cdot n} - \frac{\rho^2}{2} n = \frac{sns}{2(s \cdot n)^2}. \tag{10.129}$$


The sns form for the centre of a sphere is dual to the SnS expression found in 
equation (10.118). 
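
The dual description of a sphere lends itself to a compact sketch: build $s$ from a centre and radius, test incidence with $X \cdot s = 0$, and recover the radius and centre again. The sketch assumes the coordinate layout $(e_1, e_2, e_3, e, \bar{e})$ and uses the vector identity $sns = 2(s \cdot n)s - s^2 n$; the function names and the arbitrary `scale` factor are illustrative.

```python
from fractions import Fraction

N_INF = [0, 0, 0, 1, 1]   # point at infinity n = e + ebar

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def dual_sphere(c, rho_sq, scale=1):
    """s = scale * (F(c) - rho^2 n); the scale is irrelevant in a homogeneous model."""
    return [scale * (Fi - rho_sq * ni) for Fi, ni in zip(F(c), N_INF)]

def radius_sq(s):
    """rho^2 = s^2 / (s.n)^2."""
    return Fraction(dot(s, s), dot(s, N_INF) ** 2)

def centre(s):
    """Euclidean centre read off from the null vector s n s."""
    sn, ss = dot(s, N_INF), dot(s, s)
    sns = [2 * sn * si - ss * ni for si, ni in zip(s, N_INF)]
    k = dot(sns, N_INF)
    return [Fraction(-c, k) for c in sns[:-2]]

s = dual_sphere([1, -2, 3], 4, scale=5)
print(dot(F([3, -2, 3]), s))   # zero: the point lies on the sphere
print(radius_sq(s), centre(s))
```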


10.5.2 Intersection of two lines in a plane 


As a simple example of intersection in the conformal model, consider the inter- 
section of two lines in a Euclidean plane. The lines are described by trivectors 
$L_1$ and $L_2$ in $G(3,1)$. The intersection is described by the bivector

$$B = (L_1^* \wedge L_2^*)^* = I(L_1 \times L_2), \tag{10.130}$$

where $I$ is the conformal pseudoscalar. The bivector $B$ can contain zero, one or
two points, depending on the sign of its square, as described in section 10.4.1.
This is to be expected, as distinct circles can intersect at a maximum of two
points. If the lines are both straight, then one of the points of intersection will
be at infinity, and $B \wedge n = 0$.

To verify this result, consider the case of two straight lines, both passing 
through the origin, and with the first line in the a direction and the second in 
the b direction. With suitable normalisation we can write 


$$L_1 = aN, \qquad L_2 = bN, \tag{10.131}$$

where $N = e\bar{e}$. The intersection of $L_1$ and $L_2$ is controlled by

$$B = I\, a \wedge b \propto N \tag{10.132}$$

and the bivector $N$ contains the null vectors $n$ and $\bar{n}$. This confirms that the
lines intersect at the origin and infinity. Applying conformal transformations
to this result ensures that it holds for all lines in a plane, whether the lines
are straight or circular. The formulae for $L_1$ and $L_2$ also show that their inner
product is related to the angle between the lines,

$$L_1 \cdot L_2 = \langle aNbN \rangle = a \cdot b. \tag{10.133}$$

We can therefore write

$$\cos(\theta) = \frac{L_1 \cdot L_2}{|L_1|\,|L_2|}, \tag{10.134}$$

where $|L| = \sqrt{L^2}$. This equation returns the angle between two lines. The
quantity is invariant under the full conformal group, and not just the Euclidean
group, because angles are conformal invariants. It follows that the same formula
must hold even if $L_1$ and $L_2$ describe circles. The angle between two circles is
the angle made by their tangent vectors at the point of intersection. Two circles
intersect at a right angle, therefore, if

$$L_1 \cdot L_2 = 0. \tag{10.135}$$

This result can equally be expressed in terms of the dual vectors $l_1$ and $l_2$.
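
In the dual form the angle test needs only vector inner products. The sketch below assumes each circle is given by its dual vector $l = F(c) - \rho^2 n$, for which $l_1 \cdot l_2 = 2(\rho_1^2 + \rho_2^2 - d^2)$ with $d$ the distance between centres, so the cosine reduces to the familiar law of cosines. The overall sign depends on the orientation chosen for each circle, and the helper names are hypothetical.

```python
import math

def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x1, 2x2, e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def dual_circle(c, rho):
    """Dual vector l = F(c) - rho^2 n of the circle with centre c and radius rho."""
    n = [0, 0, 1, 1]
    return [Fi - rho * rho * ni for Fi, ni in zip(F(c), n)]

def cos_angle(l1, l2):
    """cos(theta) = l1.l2 / (|l1| |l2|), with |l| = sqrt(l^2)."""
    return dot(l1, l2) / math.sqrt(dot(l1, l1) * dot(l2, l2))

unit = dual_circle([0, 0], 1)
print(cos_angle(unit, dual_circle([1, 1], 1)))   # centres sqrt(2) apart: orthogonal
print(cos_angle(unit, dual_circle([1, 0], 1)))   # centres 1 apart
```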


10.5.3 Intersection of a line and a surface 


Now suppose that the 4-vector P defines a plane or sphere in three-dimensional 
Euclidean space, and we wish to find the point of intersection with a line de- 
scribed by the trivector L. The algebra proceeds entirely as expected and we 
arrive at the bivector 


$$B = (P^* \wedge L^*)^* = (IP) \cdot L = I \langle PL \rangle_3. \tag{10.136}$$

This bivector can again describe zero, one or two points, depending on the sign of
its square. This setup describes all possible intersections between lines or circles,
and planes or spheres, an extremely wide range of applications. Precisely the

same algebra enables us to answer whether a ring in space intersects a given 
plane, or whether a straight line passes through a sphere. 


10.5.4 Surface intersections 


Next, suppose we wish to intersect two surfaces in three dimensions. Suppose 
that these are spheres defined by the 4-vectors $S_1$ and $S_2$. Their intersection is
described by the trivector

$$L = I(S_1 \times S_2). \tag{10.137}$$

This trivector directly encodes the circle formed from the intersection of two
spheres. As with the bivector case, the sign of $L^2$ defines whether or not two
surfaces intersect. If $L^2 > 0$ then the surfaces do intersect. If $L^2 = 0$ then the
surfaces intersect at a point. Tests such as this are extremely helpful in graphics
applications.
We can similarly express the intersection in terms of the dual vectors $s_1$ and
$s_2$ as

$$L = I\, s_1 \wedge s_2. \tag{10.138}$$

As a check, the point $X$ lies on both spheres if

$$X \cdot s_1 = X \cdot s_2 = 0. \tag{10.139}$$

It follows that

$$X \cdot (s_1 \wedge s_2) = X \cdot s_1\, s_2 - X \cdot s_2\, s_1 = 0. \tag{10.140}$$
The dual result is that $X \wedge (I\, s_1 \wedge s_2) = 0$, which confirms that $X$ lies in the space
defined by the trivector $L$.
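
The sign test on $L^2$ can be sketched from the dual vectors alone. The sketch assumes that in $G(4,1)$ the pseudoscalar commutes with all elements and satisfies $I^2 = -1$, so that $L^2 = -(s_1 \wedge s_2)^2 = s_1^2 s_2^2 - (s_1 \cdot s_2)^2$. The function names and the returned labels are illustrative.

```python
def F(x):
    """Conformal image F(x) = x^2 n + 2x - nbar, stored as [2x..., e part, ebar part]."""
    x2 = sum(c * c for c in x)
    return [2 * c for c in x] + [x2 - 1, x2 + 1]

def dot(A, B):
    """Inner product with ebar (last slot) squaring to -1."""
    return sum(a * b for a, b in zip(A[:-1], B[:-1])) - A[-1] * B[-1]

def dual_sphere(c, rho):
    """Dual vector s = F(c) - rho^2 n of the sphere with centre c and radius rho."""
    n = [0, 0, 0, 1, 1]
    return [Fi - rho * rho * ni for Fi, ni in zip(F(c), n)]

def intersection(s1, s2):
    """Classify the intersection of two spheres from the sign of L^2, L = I s1 ^ s2."""
    L2 = dot(s1, s1) * dot(s2, s2) - dot(s1, s2) ** 2
    if L2 > 0:
        return "circle"
    if L2 == 0:
        return "point"
    return "none"

unit = dual_sphere([0, 0, 0], 1)
print(intersection(unit, dual_sphere([1, 0, 0], 1)))   # overlapping spheres
print(intersection(unit, dual_sphere([2, 0, 0], 1)))   # touching spheres
print(intersection(unit, dual_sphere([3, 0, 0], 1)))   # separated spheres
```

With integer inputs the comparison with zero is exact; with floating-point data a tolerance would be needed for the tangent case.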


10.5.5 Reflections in conformal space 


At various points in previous sections we have obtained formulae which generate 
reflections. We now discuss these more systematically. In section 2.6 we established
that the vector obtained by reflecting $a$ in the hyperplane perpendicular
to $l$, $l^2 = 1$, is $-lal$. But this formula assumes that the line and plane intersect
at the origin. We seek a more general expression, valid for an arbitrary line and
plane. Let $P$ denote the plane and $L$ the line we wish to reflect in the plane,
then the obvious candidate for the reflected line $L'$ is

$$L' = PLP. \tag{10.141}$$

(The sign of this is irrelevant in conformal space.) To verify that this is correct,
suppose that $L$ passes through the origin in the $a$ direction,

$$L = aN, \tag{10.142}$$

and the plane $P$ is defined by the origin and the directions $b$ and $c$,

$$P = b \wedge c\, N. \tag{10.143}$$

In this case

$$L' = b \wedge c\, a\, b \wedge c\, N = \left(-(I_3\, b \wedge c)\, a\, (I_3\, b \wedge c)\right) N, \tag{10.144}$$

where $I_3$ is the three-dimensional pseudoscalar. This expression achieves the required
result. The vector $a$ is reflected in the $b \wedge c$ plane to obtain the desired direction.
The outer product with $N$ then defines the line through the origin with the
required direction. Equation (10.141) is correct at the origin, so therefore holds
for all lines and planes, by conformal invariance.

There are a number of significant consequences of equation (10.141). The 
first is that it recovers the correct line in three dimensions without having to
find the point of reflection. The second is that it is straightforward to chain
together multiple reflections by forming successive products with planes. In this
way complicated reflections can be easily composed, all the time keeping track
of the direction and position of the resultant line. A further consequence is that
the same reflection formula must hold for higher dimensional objects. Suppose,
for example, we wish to reflect the sphere $S$ in the plane $P$. The result is

$$S' = PSP. \tag{10.145}$$


This type of equation is extremely useful in dealing with wave propagation, where 
a wavefront is modelled as a series of expanding spheres. 
Conformal invariance of the reflection formula (10.141) ensures that the same 
formula holds for reflection in a circle, or in a sphere. For example, suppose
we wish to carry out a reflection in the unit circle in two-dimensional Euclidean
space. The circle is defined by $L_0 = e_1 e_2 \bar{e}$, and the dual vector is

$$IL_0 = e. \tag{10.146}$$

Reflection in the unit circle is therefore performed by the operation

$$M \mapsto eMe. \tag{10.147}$$


This is an inversion, as discussed in section 10.3.3. In this manner, the main 
results of inversive geometry are easily formulated in terms of reflections in con- 
formal space. 


10.6 Non-Euclidean geometry 


The sudden growth in the subject of geometry in the nineteenth century was 
stimulated in part by the discovery of geometries with very different properties 
to Euclidean space. These were obtained by a simple modification of Euclid’s 
parallel postulate. For Euclidean geometry this states that, given any line l
and a point P not on the line, there exists a unique line through P in the 
plane of l and P which does not meet l. This is then a line parallel to l. For 
many centuries this postulate was viewed as problematic, as it cannot be easily 
experimentally verified. As a result, mathematicians attempted to remove the 
parallel postulate by proving it from the remaining, uncontroversial, postulates 
of Euclidean geometry. This enterprise proved fruitless, and the reason why 
was discovered by Lobachevskii and Bolyai in the 1820s. One can replace the 
parallel postulate with a different postulate, and obtain a new, mathematically 
acceptable geometry. 

There are in fact two alternative geometries one can obtain, by replacing 
the statement that there is a single line through P which does not intersect 
l with either an infinite number or zero. The case of an infinite number pro- 
duces hyperbolic geometry, which is the non-Euclidean geometry constructed by 
Lobachevskii and Bolyai. (In this section ‘non-Euclidean’ usually refers to the 
hyperbolic case.) The case of zero lines produces spherical geometry. Intuitively, 
the spherical case corresponds to space curling up, so that all (straight) lines 
meet somewhere, and the hyperbolic case corresponds to space curving outwards, 
so that lines do not meet. From the more modern perspective of Riemannian 
geometry, we are talking about homogeneous, isotropic spaces, which have no 
preferred points or directions. These can have positive, zero or negative curva- 
ture, corresponding to spherical, Euclidean and hyperbolic geometries. Today, 
the question of which of these correctly describes the universe on the largest 
scales remains an outstanding problem in cosmology. 

Figure 10.8 Circle limit III by Maurits Escher. ©2002 Cordon Art B.V.,
Baarn, Holland.

An extremely attractive feature of the conformal model of Euclidean geometry
is that, with little modification, it can be applied to both hyperbolic and spherical
geometries as well. In essence, the geometry reduces to a choice of the point 
at infinity, which in turn fixes the distance measure. This idea replaces the 
concept of the absolute conic, adopted in classical projective geometry as a means 
of imposing a distance measure. In this section we illustrate these ideas with 
a discussion of the conformal approach to planar hyperbolic geometry. As a 
concrete model of this we concentrate on the Poincaré disc. This version of 
hyperbolic geometry is mathematically very appealing, and also gives rise to 
some beautiful graphic designs, as popularised in the prints of Maurits Escher 
(see figure 10.8). 


10.6.1 The Poincaré disc 


Figure 10.9 The Poincaré disc. Points inside the disc represent points in
a hyperbolic space. A set of d-lines are also shown. These are (Euclidean)
circles that intersect the unit circle at right angles. The d-lines through A
illustrate the parallel postulate for hyperbolic geometry.

The Poincaré disc D consists of the set of points in the plane a distance r < 1
from the origin. At first sight this may not appear to be homogeneous, but in
fact the nature of the geometry will ensure that there is nothing special about
the origin. Note that points on the unit circle r = 1 are not included in this 
model of hyperbolic geometry. The key to this geometry is the concept of a 
non-Euclidean straight line. These are called d-lines, and represent geodesics in 
hyperbolic geometry. A d-line consists of a section of a Euclidean circle which 
intersects the unit circle at a right angle. Examples of d-lines are illustrated in 
figure 10.9. Given any two points in the Poincaré disc there is a unique d-line 
through them, which represents the ‘straight’ line between the points. It is now 
clear that for any point not on a given d-line l, there are an infinite number of 
d-lines through the point which do not intersect l. 

We can now begin to encode these concepts in the conformal setting. We 
continue to denote points in the plane with homogeneous null vectors in precisely 
the same manner as the Euclidean case. Suppose, then, that X and Y are the 
conformal vectors representing two points in the disc. The set of all circles 
through these two points consists of trivectors of the form $X \wedge Y \wedge A$, where $A$
is an additional point. But we require that the d-line intersects the unit circle
at right angles. The unit circle is described by the trivector $Ie$, where $I$ is the
pseudoscalar in $G(3,1)$. If a line $L$ is perpendicular to the unit circle it satisfies

$$(Ie) \cdot L = I(e \wedge L) = 0. \tag{10.148}$$

It follows that all d-lines contain a factor of $e$. The d-line through $X$ and $Y$ must
therefore be described by the trivector

$$L = X \wedge Y \wedge e. \tag{10.149}$$


One can see now that a general scheme is beginning to emerge. Everywhere in 
the Euclidean treatment that the vector n appears it is replaced in hyperbolic 
geometry by the vector e. This vector represents the circle at infinity. 
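
To make the d-line construction concrete, one can work with the vector dual to $L = X \wedge Y \wedge e$, which is orthogonal to $X$, $Y$ and $e$. The sketch below assumes this dual vector has the form $F(c) - \rho^2 n$ up to scale, as for any circle; orthogonality to $e$ then forces $\rho^2 = c^2 - 1$, and the remaining two conditions are solved with an ordinary three-dimensional cross product. The function name and return convention are hypothetical.

```python
from fractions import Fraction

def d_line(x, y):
    """Centre and squared radius of the d-line through x and y, or None for a diameter."""
    # Row (2 p1, 2 p2, -(p^2 + 1)) encodes the condition l . F(p) = 0
    # for a vector l = (l1, l2, 0, l4) with no e component.
    r1 = (2 * x[0], 2 * x[1], -(x[0] ** 2 + x[1] ** 2 + 1))
    r2 = (2 * y[0], 2 * y[1], -(y[0] ** 2 + y[1] ** 2 + 1))
    l1 = r1[1] * r2[2] - r1[2] * r2[1]
    l2 = r1[2] * r2[0] - r1[0] * r2[2]
    l4 = r1[0] * r2[1] - r1[1] * r2[0]
    if l4 == 0:
        return None                      # x, y and the origin are collinear
    c = (l1 / l4, l2 / l4)
    rho_sq = c[0] ** 2 + c[1] ** 2 - 1   # orthogonality to the unit circle
    return c, rho_sq

x = (Fraction(1, 2), Fraction(0))
y = (Fraction(0), Fraction(1, 2))
c, rho_sq = d_line(x, y)
print(c, rho_sq)
```

When the two points lie on a line through the origin the d-line is a diameter of the disc, which is why that case is returned separately.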

Given a pair of d-lines, they can either miss each other, or intersect at a point 
in the disc D. If they intersect, the angle between the lines is given by the
Euclidean formula

$$\cos(\theta) = \frac{L_1 \cdot L_2}{|L_1|\,|L_2|}. \tag{10.150}$$


It follows that angles are preserved by a general conformal transformation in 
hyperbolic geometry. A non-Euclidean transformation takes d-lines to d-lines. 
The transformation must therefore map (Euclidean) circles to circles, while pre- 
serving orthogonality with e. The group of non-Euclidean transformations must 
therefore be the subgroup of the conformal group which leaves e invariant. This 
is confirmed in the following section, where we find the appropriate distance 
measure for non-Euclidean geometry. 

The fact that the point at infinity is represented by e, as opposed to n in 
the Euclidean counterpart, provides an additional operation in non-Euclidean 
geometry. This is inversion in e: 


$$X \mapsto eXe. \tag{10.151}$$


As all non-Euclidean transformations leave e invariant, all geometric relations 
remain unchanged under this inversion. Geometrically, the interpretation of the 
inversion is quite clear. It maps everything inside the Poincaré disc to a ‘dual’ 
version outside the disc. In this dual space incidence relations and distances are 
unchanged from their counterparts inside the disc. 


10.6.2 Non-Euclidean translations and distance 


The key to finding the correct distance measure in non-Euclidean geometry is 
to first generalise the concept of a translation. Given points X and Y we know 
that the d-line connecting them is defined by $X \wedge Y \wedge e$. This is the non-Euclidean
concept of a straight line. A non-Euclidean translation must therefore move 
points along this line. Such a transformation must take X to Y, but must also 
leave e invariant. The generator for such a transformation is the bivector 


\[ B = (X \wedge Y \wedge e)\,e = Le, \tag{10.152} \]

where $L = X \wedge Y \wedge e$. We find immediately that

\[ B^2 = L^2 > 0, \tag{10.153} \]

so non-Euclidean translations are hyperbolic transformations, as one might
expect. An example of such a translation is shown in figure 10.10.

Figure 10.10 A non-Euclidean translation. The figure near the origin is
translated via a boost to give the distorted figure on the right. This
distortion in the Poincaré disc is one way of visualising the effect of a
Lorentz boost in spacetime.

We next define 


\[ \hat{B} = \frac{B}{|B|}, \qquad \hat{B}^2 = 1, \tag{10.154} \]

so that we can write

\[ Y = e^{\alpha\hat{B}/2}\, X\, e^{-\alpha\hat{B}/2}. \tag{10.155} \]


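By varying $\alpha$ we obtain the set of points along the d-line through X and Y. To
obtain a distance measure, we first require a formula for $\alpha$. If we decompose X
into

\[ X = X\hat{B}^2 = X\cdot\hat{B}\,\hat{B} + X\wedge\hat{B}\,\hat{B} \tag{10.156} \]

we obtain

\[ Y = X\wedge\hat{B}\,\hat{B} + \cosh(\alpha)\, X\cdot\hat{B}\,\hat{B} - \sinh(\alpha)\, X\cdot\hat{B}. \tag{10.157} \]

The right-hand side must give zero when contracted with Y, so

\[ \langle X\wedge\hat{B}\,\hat{B}\wedge Y\rangle + \cosh(\alpha)\langle X\cdot\hat{B}\,\hat{B}\cdot Y\rangle + \sinh(\alpha)\,(X\wedge Y)\cdot\hat{B} = 0. \tag{10.158} \]

To simplify this equation we first find

\[ X\wedge\hat{B} = \frac{X\wedge\bigl((X\wedge Y\wedge e)\,e\bigr)}{|B|} = \frac{e\cdot X\; L}{|L|} \tag{10.159} \]

and

\[ (X\wedge Y)\cdot\hat{B} = \frac{L^2}{|B|} = |L|. \tag{10.160} \]

It follows that

\[ e\cdot X\; e\cdot Y + \cosh(\alpha)\,(X\cdot Y - e\cdot X\; e\cdot Y) + \sinh(\alpha)\,|L| = 0, \tag{10.161} \]

the solution to which is

\[ \cosh(\alpha) = 1 - \frac{X\cdot Y}{X\cdot e\; Y\cdot e}. \tag{10.162} \]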


The half-angle formula is more relevant for the distance measure, and we find
that

\[ \sinh^2(\alpha/2) = -\frac{X\cdot Y}{2\,X\cdot e\; Y\cdot e}. \tag{10.163} \]

This closely mirrors the Euclidean expression, with n replaced by e.

There are a number of obvious properties that a distance measure must satisfy.
Among these is the additive property that

\[ d(X_1, X_2) + d(X_2, X_3) = d(X_1, X_3) \tag{10.164} \]

for any three points $X_1$, $X_2$, $X_3$ in this order along a d-line. Returning to the
translation formula of equation (10.155), suppose that Z is a third point along
the line, beyond Y. We can write

\[ Z = e^{\beta\hat{B}/2}\, Y\, e^{-\beta\hat{B}/2} = e^{(\alpha+\beta)\hat{B}/2}\, X\, e^{-(\alpha+\beta)\hat{B}/2}. \tag{10.165} \]


Clearly it is hyperbolic angles that must form the appropriate distance measure. 
No other function satisfies the additive property. We therefore define the non- 
Euclidean distance by 


\[ d(x,y) = 2\sinh^{-1}\left(-\frac{X\cdot Y}{2\,X\cdot e\; Y\cdot e}\right)^{1/2}. \tag{10.166} \]

In terms of the position vectors x and y in the Poincaré disc we can write

\[ d(x,y) = 2\sinh^{-1}\left(\frac{|x-y|^2}{(1-x^2)(1-y^2)}\right)^{1/2}, \tag{10.167} \]

where the modulus refers to the Euclidean distance. The presence of the arcsinh
function in the definition of distance reflects the fact that, in hyperbolic
geometry, generators of translations have positive square and the appropriate distance
measure is the hyperbolic angle. Similarly, in spherical geometry translations
correspond to rotations, and it is the trigonometric angle which plays the role
of distance. Euclidean geometry is therefore unique in that the generators of
translations are null bivectors. For these, combining translations reduces to the
addition of bivectors, and hence we recover the standard definition of Euclidean
distance.
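
As a quick numerical illustration of equation (10.167), the following sketch evaluates the non-Euclidean distance between points of the Poincaré disc and checks the additive property (10.164) for three points on a diameter, which is a d-line through the origin. The function name `hyperbolic_distance` and the sample points are assumptions made for this example and do not appear in the text.

```python
import math

def hyperbolic_distance(x, y):
    """Non-Euclidean distance of equation (10.167) between two points
    x, y (2-tuples) inside the unit disc (hypothetical helper)."""
    dx2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2   # Euclidean |x - y|^2
    x2 = x[0] ** 2 + x[1] ** 2
    y2 = y[0] ** 2 + y[1] ** 2
    return 2.0 * math.asinh(math.sqrt(dx2 / ((1.0 - x2) * (1.0 - y2))))

# Three points in order along a diameter of the disc (a d-line).
p1, p2, p3 = (0.0, 0.0), (0.3, 0.0), (0.6, 0.0)
d12 = hyperbolic_distance(p1, p2)
d23 = hyperbolic_distance(p2, p3)
d13 = hyperbolic_distance(p1, p3)
print(d12 + d23, d13)   # the two numbers agree up to rounding error
```

Replacing the arcsinh by any other function of the same argument would, in general, break this additivity, which is the point made in the text.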


10.6.3 Metrics and physical units


The derivation of the non-Euclidean distance formula of equation (10.166) forces
us to face an issue that has been ignored to date. Physical distances are
dimensional quantities, whereas our formulae for distances in both Euclidean and
non-Euclidean geometries are manifestly dimensionless, as they are homogeneous
in X. To resolve this we cannot just demand that the vector x has dimensions,
as this would imply that the conformal vector X contained terms of mixed
dimensions. Neither can this problem be circumvented by assigning dimensions of
distance to $\bar{n}$ and (distance)$^{-1}$ to n, as then e has mixed dimensions, and the
non-Euclidean formula of (10.166) is nonsensical.

The resolution is to introduce a fundamental length scale, $\lambda$, which is a positive
scalar with the dimensions of length. If the vector x has dimensions of length,
the conformal representation is then given by

\[ X = \frac{1}{2\lambda^2}\left(x^2 n + 2\lambda x - \lambda^2\bar{n}\right). \tag{10.168} \]

This representation ensures that X remains dimensionless, and is nothing more
than the conformal representation of $x/\lambda$. Physical distances can then be
converted into a dimensionally meaningful form by including appropriate factors
of $\lambda$. Curiously, the introduction of $\lambda$ into the spacetime conformal model has
many similarities to the introduction of a cosmological constant $\Lambda = \lambda^2$.

We can make contact with the metric encoding of distance by finding the
infinitesimal distance between the points x and x + dx. This defines the line
element

\[ ds^2 = \frac{4\lambda^4\, dx^2}{(\lambda^2 - x^2)^2}, \tag{10.169} \]

where the factors of $\lambda$ have been included and x is assumed to have dimensions
of distance. This line element is more often seen in polar coordinates, where it
takes the form

\[ ds^2 = \frac{4\lambda^4}{(\lambda^2 - r^2)^2}\left(dr^2 + r^2 d\theta^2\right). \tag{10.170} \]

This is the line element for a space of constant negative curvature, expressed in
terms of conformal coordinates. The coordinates are conformal because the line
element is that of a flat space multiplied by a scaling function. The geodesics
in this geometry are precisely the d-lines in the Poincaré disc. The Riemann
curvature for this metric shows that the space has uniform negative curvature,
so the space is indeed homogeneous and isotropic: there are no preferred points
or directions. The centre of the disc is not a special point, and indeed it can be
translated to any other point by ‘boosting’ along a d-line.
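
The step from the finite distance formula to the line element can be checked numerically. The sketch below assumes that the physical distance is $\lambda$ times the dimensionless distance (10.166) evaluated on the conformal representation of $x/\lambda$, which is the reading consistent with equation (10.169). The helper names and the sample values of $\lambda$, x and dx are illustrative assumptions.

```python
import math

def physical_distance(x, y, lam):
    """Distance (10.166) applied to the conformal representation of x/lam,
    multiplied by lam to restore dimensions of length (assumed convention)."""
    dx2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    x2 = x[0] ** 2 + x[1] ** 2
    y2 = y[0] ** 2 + y[1] ** 2
    arg = lam ** 2 * dx2 / ((lam ** 2 - x2) * (lam ** 2 - y2))
    return 2.0 * lam * math.asinh(math.sqrt(arg))

def line_element(x, dx, lam):
    """ds from equation (10.169): ds^2 = 4 lam^4 dx^2 / (lam^2 - x^2)^2."""
    x2 = x[0] ** 2 + x[1] ** 2
    return 2.0 * lam ** 2 * math.hypot(dx[0], dx[1]) / (lam ** 2 - x2)

lam = 2.0
x = (0.5, 0.7)
dx = (1e-6, -2e-6)                      # a small displacement
y = (x[0] + dx[0], x[1] + dx[1])
exact = physical_distance(x, y, lam)
approx = line_element(x, dx, lam)
print(exact / approx)                   # close to 1 for small dx
```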


10.6.4 Midpoints and circles in non-Euclidean geometry 


Now that we have a conformal encoding of a straight line and of distance in
non-Euclidean geometry, we can proceed to discuss concepts such as the midpoint of
two points, and the set of points a constant distance from a given point (a
non-Euclidean circle). Suppose that A and B are the conformal vectors of two
points in the Poincaré disc. Their midpoint C lies on the line $L = A \wedge B \wedge e$ and
is equidistant from both A and B. The latter condition implies that

\[ \frac{C\cdot A}{C\cdot e\; A\cdot e} = \frac{C\cdot B}{C\cdot e\; B\cdot e}. \tag{10.171} \]

Both of the conditions for C are easily satisfied by setting

\[ C = \frac{A}{2A\cdot e} + \frac{B}{2B\cdot e} + \alpha e, \tag{10.172} \]

where $\alpha$ must be chosen such that $C^2 = 0$. Normalising to $C\cdot e = -1$ we find
that the midpoint is

\[ C = -\frac{1}{\sqrt{1+\delta}}\left(\frac{A}{2A\cdot e} + \frac{B}{2B\cdot e} + \bigl(\sqrt{1+\delta} - 1\bigr)e\right), \tag{10.173} \]

where

\[ \delta = -\frac{A\cdot B}{2A\cdot e\; B\cdot e}. \tag{10.174} \]

An equation such as this is rather harder to achieve without access to the
conformal model.
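
The midpoint formula can be tried out with nothing more than four-component arrays. In the sketch below a conformal vector is stored by its components on $\{e_1, e_2, e, \bar{e}\}$, points are embedded with the representation $\frac{1}{2}(x^2 n + 2x - \bar{n})$ (equation (10.168) with $\lambda = 1$), and the plotted point is recovered as in equation (10.182). The helper names `dot`, `F`, `to_point` and `midpoint` are assumptions introduced for illustration.

```python
import math

# Conformal vectors are stored as (x1, x2, c_e, c_ebar) on the basis
# {e1, e2, e, ebar} with e1^2 = e2^2 = e^2 = +1 and ebar^2 = -1.

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

E = (0.0, 0.0, 1.0, 0.0)        # the vector e
N = (0.0, 0.0, 1.0, 1.0)        # n = e + ebar

def F(x):
    """Null vector (x^2 n + 2x - nbar)/2 for the point x, with nbar = e - ebar."""
    x2 = x[0] ** 2 + x[1] ** 2
    return (x[0], x[1], 0.5 * (x2 - 1.0), 0.5 * (x2 + 1.0))

def to_point(X):
    """Recover the plotted point from a null vector of any scale."""
    s = -1.0 / dot(X, N)
    return (s * X[0], s * X[1])

def midpoint(A, B):
    """Non-Euclidean midpoint of equation (10.173), normalised to C.e = -1."""
    delta = -dot(A, B) / (2.0 * dot(A, E) * dot(B, E))
    root = math.sqrt(1.0 + delta)
    ka, kb = 1.0 / (2.0 * dot(A, E)), 1.0 / (2.0 * dot(B, E))
    return tuple(-(ka * a + kb * b + (root - 1.0) * e) / root
                 for a, b, e in zip(A, B, E))

C = midpoint(F((0.0, 0.0)), F((0.6, 0.0)))
print(dot(C, C), dot(C, E))   # C is null and satisfies C.e = -1
print(to_point(C))            # approximately (1/3, 0)
```

The Euclidean midpoint of the two sample points is (0.3, 0), whereas the non-Euclidean midpoint sits further out at about (1/3, 0), reflecting the stretching of distances towards the edge of the disc.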

Next suppose we wish to find the set of points a constant (non-Euclidean)
distance from the point C. This defines a non-Euclidean circle with centre C.
From equation (10.166), any point X on the circle must satisfy

\[ -\frac{X\cdot C}{2\,X\cdot e\; C\cdot e} = \text{constant} = \alpha^2, \tag{10.175} \]

so that, by equation (10.166), the radius is $2\sinh^{-1}(\alpha)$. It follows that

\[ X\cdot(C + 2\alpha^2\, C\cdot e\; e) = 0. \tag{10.176} \]

If we define s by

\[ s = C + 2\alpha^2\, C\cdot e\; e \tag{10.177} \]

we see that $s^2 > 0$, and the circle is defined by $X\cdot s = 0$. But this is precisely the
formula for a circle in Euclidean geometry, so non-Euclidean circles still appear
as ordinary circles when plotted in the Poincaré disc. The only difference is the
interpretation of their centre. The Euclidean centre of the circle s, defined by
sns, does not coincide with the non-Euclidean centre C. This is illustrated in
figure 10.11.

Figure 10.11 Non-Euclidean circles. A series of non-Euclidean circles with
differing radii are shown, all about the common centre A. A d-line through
A is also shown. This intersects each circle at a right angle.
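
The offset between the two centres is easy to see numerically. The following sketch builds the vector s of equation (10.177) for a chosen centre and constant $\alpha$, expands $X\cdot s = 0$ in plane coordinates to read off the Euclidean centre and radius, and then confirms that every point of that Euclidean circle lies at the same non-Euclidean distance (10.167) from C. The component layout, helper names and sample values are assumptions made for the example.

```python
import math

def dot(a, b):
    # Inner product on the basis {e1, e2, e, ebar}, signature (+, +, +, -).
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] - a[3] * b[3]

def F(x):
    # Null vector (x^2 n + 2x - nbar)/2, with n = e + ebar and nbar = e - ebar.
    x2 = x[0] ** 2 + x[1] ** 2
    return (x[0], x[1], 0.5 * (x2 - 1.0), 0.5 * (x2 + 1.0))

E = (0.0, 0.0, 1.0, 0.0)  # the vector e

def circle_vector(c, alpha):
    # s = C + 2 alpha^2 (C.e) e, equation (10.177), for the centre c.
    C = F(c)
    k = 2.0 * alpha ** 2 * dot(C, E)
    return (C[0], C[1], C[2] + k, C[3])

def euclidean_circle(s):
    # Expanding F(x).s = 0 gives an ordinary circle in the plane.
    w = s[2] - s[3]
    centre = (-s[0] / w, -s[1] / w)
    radius = math.sqrt(centre[0] ** 2 + centre[1] ** 2 + (s[2] + s[3]) / w)
    return centre, radius

def hyperbolic_distance(x, y):
    # Equation (10.167).
    dx2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    x2 = x[0] ** 2 + x[1] ** 2
    y2 = y[0] ** 2 + y[1] ** 2
    return 2.0 * math.asinh(math.sqrt(dx2 / ((1.0 - x2) * (1.0 - y2))))

c, alpha = (0.4, 0.2), 0.5
centre, radius = euclidean_circle(circle_vector(c, alpha))
distances = []
for k in range(8):
    phi = 2.0 * math.pi * k / 8
    p = (centre[0] + radius * math.cos(phi), centre[1] + radius * math.sin(phi))
    distances.append(hyperbolic_distance(p, c))
print(centre, c)                       # the two centres differ
print(min(distances), max(distances))  # equal up to rounding
```

For these sample values the Euclidean centre is pulled towards the origin relative to C, as one would expect from the way distances stretch near the boundary of the disc.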

Suppose that A, B and C are three points in the Poincaré disc. We can still
define the line L through these points by

\[ L = A \wedge B \wedge C, \tag{10.178} \]

and this defines the circle through the three points regardless of the geometry we
are working in. All that is different in the two geometries is the position of the
midpoint and the size of the radius. The test that the three points lie on a d-line
is simply that $L \wedge e = 0$. Again, the Euclidean formula holds, but with n replaced
by e. Similar comments apply to other operations in conformal space, such as
reflection. Given a line L, points are reflected in this line by the map $X \mapsto LXL$.
This formula is appropriate in both Euclidean and non-Euclidean geometry. In
the non-Euclidean case it is not hard to verify that LXL corresponds to first
finding the d-line through X intersecting L at right angles, and then finding the
point on this line an equal non-Euclidean distance on the other side. This is as
one would expect for the definition of reflection in a line.


10.6.5 A unified framework for geometry


We have so far seen how Euclidean and hyperbolic geometries can both be han- 
dled in terms of null vectors in conformal space. The key concept is the vector 
representing the point at infinity, which remains invariant under the appropriate 
symmetry group. The full conformal group of a space with signature (p,q) is 
the orthogonal group O(p + 1,q + 1). The group of Euclidean transformations 
is the subgroup of O(p + 1,q + 1) that leaves the vector n invariant. The hyper- 
bolic group is the subgroup of O(p + 1,q + 1) which leaves e invariant. For the 
case of planar geometry, with signature (2,0), the hyperbolic group is O(2,1). 
The Killing form for this group is non-degenerate (see chapter 11), which makes 
hyperbolic geometry a useful way of compactifying a flat space. 

The remaining planar geometry to consider is spherical geometry. By now, it
should come as little surprise that spherical geometry is handled in the conformal
framework in terms of transformations which leave the vector $\bar{e}$ invariant. For
the case of the plane, the conformal algebra has signature (3,1), with $\bar{e}$ the basis
vector with negative signature. The subgroup of the conformal group which
leaves $\bar{e}$ invariant is therefore the orthogonal group O(3,0), which is the group
one expects for a 2-sphere. The distance measure for spherical geometry is

\[ d(x,y) = 2\lambda\sin^{-1}\left(-\frac{X\cdot Y}{2\,X\cdot\bar{e}\; Y\cdot\bar{e}}\right)^{1/2}, \tag{10.179} \]

with $\bar{e}$ replacing n in the obvious manner. To see that this expression is correct,
suppose that we write

\[ \frac{X}{X\cdot\bar{e}} = \hat{x} - \bar{e}, \tag{10.180} \]

where $\hat{x}$ is a unit vector built in the three-dimensional space spanned by the
vectors $e_1$, $e_2$ and e. With $Y/Y\cdot\bar{e}$ written in the same way we find that

\[ -\frac{X\cdot Y}{2\,X\cdot\bar{e}\; Y\cdot\bar{e}} = \frac{1 - \hat{x}\cdot\hat{y}}{2} = \sin^2(\theta/2), \tag{10.181} \]

where $\theta$ is the angle between the unit vectors on the 2-sphere. The distance
measure is then precisely the angle $\theta$ multiplied by the dimensional quantity $\lambda$,
which represents the radius of the sphere.

Conformal geometry provides a unified framework for the three types of planar
geometry because in all cases the conformal groups are the same. That is, the
group of transformations of the sphere that leave angles in the sphere unchanged is
the same as for the plane and the hyperboloid. In all cases the group is O(3,1).
The geometries are then recovered by a choice of distance measure. In classical
projective geometry the distance measure is defined by the introduction of the
absolute conic. All lines intersect this conic in a pair of points. The distance
between two points A and B is then found from the four-point ratio between A,
B, and the two points of intersection of the line through A and B and the absolute
conic. In this way all geometries are united in the framework of projective
geometry. But there is a price to pay for this scheme: all coordinates have to
be complex, to ensure that all lines intersect the conic in two points. Recovering
a real geometry is then rather clumsy. In addition, the conformal group is not a
subgroup of the projective group, so much of the elegant unity exhibited by the
three geometries is lost. Conformal geometry is a more powerful framework for
a unified treatment of these geometries. Furthermore, the conformal approach
can be applied to spaces of any dimension with little modification. Trivectors
represent lines and circles, 4-vectors represent planes and spheres, and so on.
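
The spherical distance measure discussed above can be checked against the angle on the 2-sphere directly. The sketch below writes the distance in plane coordinates, using $X\cdot Y = -\frac{1}{2}|x-y|^2$ and $X\cdot\bar{e} = -\frac{1}{2}(1+x^2)$ for the representation $X = \frac{1}{2}(x^2 n + 2x - \bar{n})$, and compares it with the angle between the corresponding unit vectors. The function names and the sample points are assumptions made for this illustration.

```python
import math

def spherical_distance(x, y, lam=1.0):
    """Spherical distance in plane coordinates, using
    X.Y = -|x - y|^2 / 2 and X.ebar = -(1 + x^2)/2 for X = F(x)."""
    dx2 = (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
    x2 = x[0] ** 2 + x[1] ** 2
    y2 = y[0] ** 2 + y[1] ** 2
    return 2.0 * lam * math.asin(math.sqrt(dx2 / ((1.0 + x2) * (1.0 + y2))))

def sphere_point(x):
    """Components on {e1, e2, e} of the unit vector X/(X.ebar) + ebar."""
    x2 = x[0] ** 2 + x[1] ** 2
    return (-2.0 * x[0] / (1.0 + x2), -2.0 * x[1] / (1.0 + x2),
            (1.0 - x2) / (1.0 + x2))

def angle(u, v):
    c = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against rounding

x, y = (0.3, -0.4), (1.5, 0.8)
print(spherical_distance(x, y), angle(sphere_point(x), sphere_point(y)))
```

With `lam=1.0` the two printed numbers agree, so the distance is the angle subtended on the unit sphere; a different `lam` simply rescales it to a sphere of that radius.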

So far we have restricted ourselves to a single view of the various geometries,
but the discussion of the sphere illustrates that there are many different ways of
representing the underlying geometry. To begin with, we have plotted points on
the Euclidean plane according to the formula

\[ x = -\frac{X\wedge N}{X\cdot n}\, N, \tag{10.182} \]

where $N = e\bar{e}$. This is the natural scheme for plotting on a Euclidean piece of
paper, as it ensures that the angle between lines on the paper is the correct angle
in each of the three geometries. Euclidean geometry plotted in this way recovers
the obvious standard picture of Euclidean geometry. Hyperbolic geometry led
to the Poincaré disc model, in which hyperbolic lines appear as circles. For
spherical geometry the ‘straight lines’ are great circles on a sphere. On the plane
these also plot as circles. This time the condition is that all circles intersect the
unit circle at antipodal points. This then defines the spherical line between
two points (see figure 10.12). This view of spherical geometry is precisely that
obtained from a stereographic projection of the sphere onto the plane. This
is not a surprise, as the conformal model was initially constructed in terms of
a stereographic projection, with the $\bar{e}$ vector then enabling us to move to a
homogeneous framework. In this representation of spherical geometry the map

\[ X \mapsto \bar{e}X\bar{e} \tag{10.183} \]


is a symmetry operation. This maps points to their antipodal opposites on the 
sphere. In the planar view this transformation is an inversion in the unit circle, 
followed by a reflection in the origin. 

We now have three separate geometries, all with conformal representations in
the plane such that the true angle between lines is the same as that measured on
the plane. The price for such a representation is that straight lines in spherical
and hyperbolic geometries do not appear straight in the plane. But we could
equally choose to replace the map of equation (10.182) with an alternative rule of
how to plot the null vector X on a planar piece of paper. The natural alternatives
to consider are replacing the vector n with e and $\bar{e}$. In total we then have three
different planar realisations of each of the two-dimensional geometries. First,
suppose we define

\[ y = -\frac{X\wedge N}{X\cdot e}\, N. \tag{10.184} \]

In terms of the vector x we have

\[ y = \frac{2x}{1 - x^2}, \tag{10.185} \]

which represents a radial rescaling. Euclidean straight lines now appear as
hyperbolae or ellipses, depending on whether or not the original line intersected
the disc. If the line intersected the disc then the map of equation (10.185) has
two branches and defines a hyperbola. If the line misses the disc then an ellipse
is obtained. In all cases the image lines pass through the origin, as this is the
image of the point at infinity.

Figure 10.12 Stereographic view of spherical geometry. All great circles on
the 2-sphere project onto circles in the plane which intersect the unit circle
(shown in bold) at antipodal points. A series of such lines are shown.

The fact that the map of equation (10.185) is two-to-one means it has little 
use as a version of Euclidean geometry. It is better suited to hyperbolic geom- 
etry, as one might expect, as the Poincaré disc is now mapped onto the entire 
plane. Hyperbolic straight lines now appear as (single-branch) hyperbolae on 
the Euclidean page, all with their asymptotes crossing at the origin. If the dual 
space outside the disc is included in the map, then this generates the second 
branch of each hyperbola. Points then occur in pairs, with each point paired 
with its image under reflection in the origin. Finally, we can consider spheri- 
cal geometry as viewed on a plane through the map of equation (10.185). This 
defines a standard projective map between a sphere and the plane. Antipodal 
points on the sphere define the same point on the plane and spherical straight 
lines appear as straight lines. 
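
The two-to-one character of the map (10.185) can be made concrete with a few lines of code. In this sketch the second preimage of a point is $-x/x^2$, and the dual point obtained by inversion in the unit circle, $x/x^2$, lands on the image reflected in the origin. Both statements follow from the formula $y = 2x/(1 - x^2)$; the identification of $x/x^2$ with the plotted form of $X \mapsto eXe$ is worked out here from the representation $\frac{1}{2}(x^2 n + 2x - \bar{n})$ and is labelled as an assumption, as are the names used.

```python
def view_185(x):
    """The plotting map y = 2x / (1 - x^2) of equation (10.185)."""
    x2 = x[0] ** 2 + x[1] ** 2
    return (2.0 * x[0] / (1.0 - x2), 2.0 * x[1] / (1.0 - x2))

p = (0.3, 0.4)                         # a point inside the unit disc
p2 = p[0] ** 2 + p[1] ** 2
dual = (p[0] / p2, p[1] / p2)          # x / x^2: assumed plotted form of X -> eXe
twin = (-p[0] / p2, -p[1] / p2)        # -x / x^2: the second preimage

print(view_185(p))     # image of p
print(view_185(twin))  # the same image: the map is two-to-one
print(view_185(dual))  # the image of p reflected in the origin
```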

Similarly, we can consider plotting vectors in the plane according to

\[ y = -\frac{X\wedge N}{X\cdot\bar{e}}\, N = -\frac{F(x)\wedge N}{F(x)\cdot\bar{e}}\, N \tag{10.186} \]

or in terms of the vector x

\[ y = \frac{2x}{1 + x^2}. \tag{10.187} \]


This defines a one-to-one map of the unit disc onto itself, and a two-to-one map 
of the entire plane onto the disc. Euclidean straight lines now appear plotted as 
ellipses inside the unit disc. This construction involves forming a stereographic 
projection of the plane onto the 2-sphere, so that lines map to circles on the 
sphere. The sphere is then mapped onto the plane by viewing from above, so 
that circles on the sphere map to ellipses. All ellipses pass through the origin, 
as this is the image of the point at infinity. 

Similar comments apply to spherical geometry. Spherical lines are great circles 
on the sphere, and viewed in the plane according to equation (10.187) great circles 
appear as ellipses centred on the origin and touching the unit circle at their 
endpoints. The two-to-one form of the projection means that circle intersections 
are not faithfully represented in the disc as some of the apparent intersections 
are actually caused by points on opposite sides of the plane. Finally, we consider 
plotting hyperbolic geometry in the view of equation (10.187). The disc maps 
onto itself, so we do have a faithful representation of hyperbolic geometry. This 
is a representation in which hyperbolic lines appear straight on the page, though 
angles are not rendered correctly, and non-Euclidean circles appear as ellipses. 
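
The claim that hyperbolic lines appear straight in the view of equation (10.187) can be checked for a particular d-line. The sketch below takes the circle with centre (c, 0) and radius $\sqrt{c^2 - 1}$, which meets the unit circle at right angles, samples points on the arc inside the disc and maps them with $y = 2x/(1 + x^2)$. The choice c = 2, the sample angles and the function name are assumptions made for the example.

```python
import math

def view_187(x):
    """The plotting map y = 2x / (1 + x^2) of equation (10.187)."""
    x2 = x[0] ** 2 + x[1] ** 2
    return (2.0 * x[0] / (1.0 + x2), 2.0 * x[1] / (1.0 + x2))

# A d-line in the Poincare disc: the arc of the circle with centre (c, 0)
# and radius sqrt(c^2 - 1), which meets the unit circle at right angles.
c = 2.0
rho = math.sqrt(c * c - 1.0)
points = []
for k in range(1, 6):
    phi = math.pi + (k - 3) * 0.15          # angles on the side facing the origin
    p = (c + rho * math.cos(phi), rho * math.sin(phi))
    if p[0] ** 2 + p[1] ** 2 < 1.0:         # keep points inside the disc
        points.append(p)

images = [view_187(p) for p in points]
print([round(y[0], 12) for y in images])     # every image has first coordinate 1/c
```

All image points share the first coordinate 1/c, so the circular arc has been mapped onto a straight chord of the unit disc.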

As well as viewing each geometry on the Euclidean plane, we can also picture
the geometries on a sphere or a hyperboloid. The spherical picture is obtained
in equation (10.180), and the hyperboloid view is similarly obtained by setting

\[ \frac{X}{X\cdot e} = \hat{x} + e, \tag{10.188} \]

where $\hat{x}^2 = -1$. The set of $\hat{x}$ defines a pair of hyperbolic sheets in the space
defined by the vectors $\{e_1, e_2, \bar{e}\}$. The fact that two sheets are obtained explains
why some views of hyperbolic geometry end up with points represented twice.
So, as well as three geometries (defined by a transformation group) and a variety
of plotting schemes, we also have a choice of space to draw on, providing a large
number of alternative schemes for studying the three geometries. At the back of
all of this is a single algebraic scheme, based on the geometric algebra of
conformal space. Any algebraic result involving products of null vectors immediately
produces a geometric theorem in each geometry, which can be viewed in a variety
of different ways.


10.7 Spacetime conformal geometry


As a final application of the conformal approach to geometry we turn to space- 
time. The conformal geometric algebra for a spacetime with signature (1,3) is 
the six-dimensional algebra with signature (2,4). The algebra G(2,4) contains 
64 terms, which decompose into graded subspaces of dimensions 1, 6, 15, 20, 15, 
6 and 1. As a basis for this space we use the standard spacetime algebra basis
$\{\gamma_\mu\}$, together with the additional vectors $\{e, \bar{e}\}$. The pseudoscalar I is defined
by

\[ I = \gamma_0\gamma_1\gamma_2\gamma_3 e\bar{e}. \tag{10.189} \]

This has negative norm, $I^2 = -1$. The conformal algebra allows us to simply
encode ideas such as closed circles in spacetime, or light-spheres centred on an
arbitrary point.
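
The counting quoted for G(2,4) and the sign of the pseudoscalar square can be reproduced in a few lines. The sketch assumes the basis ordering $\gamma_0, \gamma_1, \gamma_2, \gamma_3, e, \bar{e}$ with squares $+1, -1, -1, -1, +1, -1$, and uses the standard fact that reversing a product of n orthogonal vectors costs a sign $(-1)^{n(n-1)/2}$.

```python
import math

signature = [+1, -1, -1, -1, +1, -1]   # gamma_0, gamma_1, gamma_2, gamma_3, e, ebar
n = len(signature)

# Dimension of each graded subspace is a binomial coefficient.
grade_dims = [math.comb(n, k) for k in range(n + 1)]
print(grade_dims, sum(grade_dims))      # [1, 6, 15, 20, 15, 6, 1] 64

# Squaring the product of n orthogonal vectors: reversing the order of the
# second copy costs a sign (-1)^(n(n-1)/2), after which each vector meets
# itself and contributes its own square.
reversal_sign = (-1) ** (n * (n - 1) // 2)
pseudoscalar_square = reversal_sign * math.prod(signature)
print(pseudoscalar_square)              # -1
```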

The conformal algebra of spacetime also arises classically in a slightly
different setting. In conformal geometry, circles and spheres are represented
homogeneously as trivectors and 4-vectors. These are unoriented because L and $-L$ are
used to encode the same object. A method of dealing with oriented spheres was
developed by Sophus Lie and is called Lie sphere geometry. A sphere in three
dimensions can be represented by a vector s in the conformal algebra G(4,1),
with $s^2 > 0$. Lie sphere geometry is obtained by introducing a further basis
vector of negative signature, f, and replacing s by the null vector

\[ \bar{s} = s + |s|f, \qquad \bar{s}^2 = 0. \tag{10.190} \]


Now the spheres encoded by s and —s have different representations as null 
vectors in a space of signature (4,2). This algebra is ideally suited to handling the 
contact geometry of spheres. The signature shows that this space is isomorphic 
to the conformal algebra of spacetime, so in a sense the introduction of the vector 
f can be thought of as introducing a time direction. A sphere can then be viewed 
as a light-sphere allowed to grow for a certain time. Orientation for spheres is 
then handled by distinguishing between incoming and outgoing light-spheres. 

The conformal geometry of spacetime is a rich and important subject. The 
Poincaré group of spacetime translations and rotations is a subgroup of the full 
conformal group, but in a number of subjects in theoretical physics, including 
supersymmetry and supergravity, it is the full conformal group that is relevant. 
One reason is that conformal symmetry is present in most massless theories. This 
symmetry then has consequences that can carry over to the massive regime. We 
will not develop the classical approach to spacetime conformal geometry further 
here. Instead, we concentrate on an alternative route through to conformal 
geometry, which unites the multiparticle spacetime algebra of chapter 9 with the
concept of a twistor.


10.7.1 The spacetime conformal group


For most of this chapter we have avoided detailed descriptions of the relationships
between the groups involved in the geometric algebra formulation of conformal
geometry. For the following, however, it is helpful to have a clearer picture of
precisely how the various groups fit together. The subject of Lie groups in
general is discussed in chapter 11. The spacetime conformal group C(1,3) consists
of spacetime maps $x \mapsto f(x)$ that preserve angles. This is the definition first
encountered in section 10.3. The group of orthogonal transformations O(2,4)
is a double-cover representation of the conformal group, because in conformal
space both X and $-X$ represent the same spacetime point. As with Lorentz
transformations, we are typically interested in the restricted conformal group.
This consists of transformations that preserve orientation and time sense, and
contains translations, proper orthochronous rotations, dilations and special
conformal transformations. The restricted orthogonal group, SO$^+$(2,4), is a
double-cover representation of the restricted conformal group.

We can form a double-cover representation of SO$^+$(2,4) by writing all
restricted orthogonal transformations as rotor transformations $a \mapsto Ra\tilde{R}$. The
group of conformal rotors, denoted spin$^+$(2,4), is therefore a four-fold covering
of the restricted conformal group. The rotor group in G(2,4) is isomorphic to
the Lie group SU(2,2). It follows that the action of the restricted conformal
group can be represented in terms of complex linear transformations of
four-dimensional vectors, in a complex space of signature (2,2). This is the basis
of the twistor programme, initiated by Roger Penrose. Twistors were introduced
as objects describing the geometry of spacetime at a ‘pre-metric’ level, one of
the aims being to provide a route to a quantum theory of gravity. Instead of
points and a metric, twistors represent incidence relations between null rays.
Spacetime points and their metric relations then emerge as a secondary concept,
corresponding to the points of intersection of null lines.

As a first step in understanding the twistor programme, we establish a concrete
representation of the conformal group within the spacetime algebra. The key to
this is the observation that the spinor inner product

\[ \langle\tilde{\psi}\phi\rangle_q = \langle\tilde{\psi}\phi\rangle - \langle\tilde{\psi}\phi I\sigma_3\rangle I\sigma_3 \tag{10.191} \]

defines a complex space with precisely the required metric. The complex
structure is represented by right-multiplication by combinations of 1 and $I\sigma_3$, as
discussed in chapter 8. We continue to refer to $\psi$ and $\phi$ as spinors, as they are
acted on by a spin representation of the restricted conformal group. To establish
a representation in terms of operators on $\psi$, we first form a representation of the
bivectors in G(2,4) as

\[ e\gamma_\mu \leftrightarrow \gamma_\mu\psi\gamma_0 I\sigma_3 = \gamma_\mu\psi I\gamma_3, \qquad \bar{e}\gamma_\mu \leftrightarrow I\gamma_\mu\psi\gamma_0. \tag{10.192} \]

A representation of the even subalgebra of G(2,4), and hence an arbitrary rotor,
can be constructed from these bivectors. The representation of each of the
operations in the restricted conformal group can now be constructed from the rotors
found in section 10.3. We use the same symbol for the spinor representation of
the transformations as the vector case. A translation by the vector a has the
spin representation

\[ T_a(\psi) = \psi + a\psi I\gamma_3\tfrac{1}{2}(1 + \sigma_3). \tag{10.193} \]


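The spinor inner product of equation (10.191) is invariant under this
transformation. To confirm this, suppose that we set

\[ \psi' = T_a(\psi) \quad\text{and}\quad \phi' = T_a(\phi). \tag{10.194} \]

The quantum inner product contains the terms

\[ \langle\tilde{\psi}'\phi'\rangle = \bigl\langle\bigl(\phi + a\phi I\gamma_3\tfrac{1}{2}(1+\sigma_3)\bigr)\bigl(\tilde{\psi} - \tfrac{1}{2}(1-\sigma_3)I\gamma_3\tilde{\psi}a\bigr)\bigr\rangle = \langle\tilde{\psi}\phi\rangle \tag{10.195} \]

and

\[ \langle\tilde{\psi}'\phi' I\sigma_3\rangle = \bigl\langle\bigl(\phi + a\phi I\gamma_3\tfrac{1}{2}(1+\sigma_3)\bigr)I\sigma_3\bigl(\tilde{\psi} - \tfrac{1}{2}(1-\sigma_3)I\gamma_3\tilde{\psi}a\bigr)\bigr\rangle = \langle\tilde{\psi}\phi I\sigma_3\rangle. \tag{10.196} \]

It follows that

\[ \langle\tilde{\psi}'\phi'\rangle_q = \langle\tilde{\psi}\phi\rangle_q \tag{10.197} \]

as expected.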
The spinor representation of a rotation about the origin is precisely the
spacetime algebra rotor, so we can write

\[ R_0(\psi) = R\psi, \tag{10.198} \]

where $R_0$ denotes a rotation in the origin, and R is a spacetime rotor. Rotations
about arbitrary points are constructed from combinations of translations and
rotations. The dilation $x \mapsto \exp(\alpha)x$ has the spinor representation


\[ D_\alpha(\psi) = \psi e^{\alpha\sigma_3/2}. \tag{10.199} \]


This represents a dilation in the origin. Dilations about a general point are
also obtained from a combination of translations and a dilation in the origin.
The representation of the restricted conformal group is completed by the special
conformal transformations, which are represented by

\[ K_a(\psi) = \psi - a\psi I\gamma_3\tfrac{1}{2}(1 - \sigma_3). \tag{10.200} \]

It is a routine exercise to confirm that the preceding operations do form a spin
representation of the restricted conformal group.




The full conformal group includes inversions. These can be represented as
antiunitary operators. An inversion in the origin is represented by

\[ \psi \mapsto \psi' = \psi I\sigma_2. \tag{10.201} \]

The effect of this on the inner product of equation (10.191) is that we form

\[ \langle\tilde{\psi}'\phi'\rangle_q = \langle\tilde{\psi}\phi\rangle_q^{\sim} = \langle\tilde{\phi}\psi\rangle_q. \tag{10.202} \]

This representation of an inversion in the origin satisfies

\[ D_\alpha(\psi I\sigma_2) = D_{-\alpha}(\psi)\, I\sigma_2, \tag{10.203} \]

as required.


10.7.2 Multiparticle representation of conformal vectors 


We have defined a carrier space for a spin-1/2 representation of the spacetime
conformal group. A vector representation of the conformal group can therefore
be constructed from quadratic combinations of spinors. Spinors can be thought
of as belonging to a complex four-dimensional space. The tensor product space
therefore contains 16 complex degrees of freedom. This decomposes into a
ten-dimensional symmetric space and a six-dimensional antisymmetric space. The six
complex degrees of freedom in the antisymmetric representation are precisely
the dimensions required to construct a conformal vector. The ten-dimensional
symmetric space has 20 real degrees of freedom, and forms a representation of
trivectors in conformal spacetime.

In principle, then, we will form complex vectors in conformal spacetime. But
for a special class of spinor the conformal vector is real. If we translate a constant
spinor by the position vector $r = x^\mu\gamma_\mu$, we form the object

\[ T_r(\psi) = \psi + r\psi I\gamma_3\tfrac{1}{2}(1 + \sigma_3), \tag{10.204} \]


which is the spacetime algebra version of a twistor. A twistor is essentially a
spacetime algebra spinor with a particular position dependence. The key to
constructing a real conformal vector from an antisymmetric pair of twistors is
to impose the conditions that they are both null, and orthogonal. Suppose that
we set

\[ X = T_r(\psi), \qquad Z = T_r(\phi). \tag{10.205} \]

The conditions that these generate a real conformal vector are then

\[ \langle\tilde{X}X\rangle_q = \langle\tilde{Z}Z\rangle_q = \langle\tilde{X}Z\rangle_q = 0. \tag{10.206} \]

The position dependence in X and Z does not affect the inner product, so the
same conditions must also be satisfied by $\psi$ and $\phi$. Choosing appropriate spinors
satisfying these relationships essentially amounts to a choice of origin. The most
straightforward way to satisfy the requirements is to set

\[ X = \omega\tfrac{1}{2}(1 - \sigma_3) + r\omega I\gamma_3\tfrac{1}{2}(1 + \sigma_3) \tag{10.207} \]

and

\[ Z = \kappa\tfrac{1}{2}(1 - \sigma_3) + r\kappa I\gamma_3\tfrac{1}{2}(1 + \sigma_3), \tag{10.208} \]

where $\omega$ and $\kappa$ are Pauli spinors (spinors in the spacetime algebra that commute
with $\gamma_0$).

To construct a vector from the two twistors X and Z we form their
antisymmetrised tensor product in the multiparticle spacetime algebra. We therefore
construct the multivector

\[ \psi_r = (X^1 Z^2 - Z^1 X^2)E, \tag{10.209} \]

where the notation follows section 9.2. If we now make use of the results in
table 9.2 we find that

\[ \psi_r = \left(r\cdot r\,\epsilon - r^1\eta\gamma_0^1 J - \bar{\epsilon}\right)\langle I\sigma_2\tilde{\kappa}\omega\rangle_q, \tag{10.210} \]

where $\eta$ is the Lorentz singlet state defined in equation (9.93), and $\epsilon$ and $\bar{\epsilon}$ are
defined by

\[ \epsilon = \eta\tfrac{1}{2}(1 + \sigma_3^1), \qquad \bar{\epsilon} = \eta\tfrac{1}{2}(1 - \sigma_3^1). \tag{10.211} \]

The two-particle state $\psi_r$ closely resembles our standard encoding of a point as a
null vector in conformal space. The singlet state $\epsilon$ represents the point at infinity,
and is the spacetime algebra version of the infinity twistor. The opposite ideal,
$\bar{\epsilon}$, represents the origin (r = 0).

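More generally, given arbitrary single-particle spinors, we arrive at a complex
six-dimensional vector. Restricting to the real subspace, a general point in this
space can be written as the state

\[ \psi_P = (V - W)\epsilon + P^1\eta\gamma_0^1 J + (V + W)\bar{\epsilon}, \tag{10.212} \]

where

\[ P = T\gamma_0 + X\gamma_1 + Y\gamma_2 + Z\gamma_3. \tag{10.213} \]

To form the inner product of such states we require the results that

\[ \langle\tilde{\epsilon}\epsilon\rangle_q = \langle\tilde{\bar{\epsilon}}\bar{\epsilon}\rangle_q = 0, \qquad 4\langle\tilde{\bar{\epsilon}}\epsilon\rangle_q = 1. \tag{10.214} \]

Now forming the quantum norm for the state $\psi_P$ we find that

\[ 2\langle\tilde{\psi}_P\psi_P\rangle_q = T^2 + V^2 - W^2 - X^2 - Y^2 - Z^2. \tag{10.215} \]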


So (V, W, T, X, Y, Z) are the coordinates of a six-dimensional vector in a space
with signature (2,4). This establishes the map between a two-particle
antisymmetrised spinor and a conformal vector.

Our ‘real’ state $\psi_r$ can be cast into standard form by removing the complex
factor on the right-hand side and setting

\[ \psi_r \mapsto -\frac{\psi_r}{4\langle\tilde{\psi}_r\epsilon\rangle_q}. \tag{10.216} \]

Once this is done, all reference to the original $\omega$ and $\kappa$ spinors is removed. The
inner product between two two-particle states $\psi_r$ and $\phi_s$, where $\phi_s$ represents
the point s, returns

\[ -\frac{\langle\tilde{\psi}_r\phi_s\rangle_q}{4\langle\tilde{\psi}_r\epsilon\rangle_q\langle\tilde{\phi}_s\epsilon\rangle_q} = (r - s)\cdot(r - s). \tag{10.217} \]

The multiparticle inner product therefore recovers the square of the spacetime
distance between points. This result is one reason why points are encoded
through pairs of null twistors.
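
The coordinate content of this construction can be illustrated without any spinor algebra. The sketch below works directly with the six coordinates (V, W, T, X, Y, Z) and the quadratic form of equation (10.215). It assumes that the standard form of a point r is $r\cdot r\,\epsilon - r^1\eta\gamma_0^1 J - \bar{\epsilon}$, so that matching against equation (10.212) gives V - W = r·r, V + W = -1 and P = -r. The function names and sample events are illustrative assumptions.

```python
def minkowski(r, s):
    """Spacetime inner product with signature (+, -, -, -)."""
    return r[0] * s[0] - r[1] * s[1] - r[2] * s[2] - r[3] * s[3]

def six_vector(r):
    """Coordinates (V, W, T, X, Y, Z) of the point r = (t, x, y, z), read
    off from the assumed standard form: V - W = r.r, V + W = -1, P = -r."""
    rr = minkowski(r, r)
    V, W = 0.5 * (rr - 1.0), -0.5 * (rr + 1.0)
    return (V, W, -r[0], -r[1], -r[2], -r[3])

def form(p, q):
    """Bilinear form behind equation (10.215), signature (2, 4)."""
    return (p[0] * q[0] - p[1] * q[1] + p[2] * q[2]
            - p[3] * q[3] - p[4] * q[4] - p[5] * q[5])

r = (1.0, 0.2, -0.3, 0.5)
s = (-0.4, 1.1, 0.0, 0.7)
R, S = six_vector(r), six_vector(s)
d = tuple(a - b for a, b in zip(r, s))
print(form(R, R), form(S, S))              # both vanish up to rounding: null vectors
print(form(R, S), -0.5 * minkowski(d, d))  # equal: the interval is recovered
```

In this normalisation the bilinear form of two points equals minus one half of the squared interval, mirroring the Euclidean conformal model.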

We have now established a complete representation of conformal vectors for
spacetime in terms of antisymmetrised products of a class of spinors, each
evaluated in a single copy of the spacetime algebra. We should now check that
our representation of the conformal group through its action on spinors induces
the correct vector representation in the two-particle algebra. We start with our
standard multiparticle representation of a conformal vector as

\[ \psi_r = r\cdot r\,\epsilon - r^1\eta\gamma_0^1 J - \bar{\epsilon}. \tag{10.218} \]

The first operation to consider is a translation. The spinor representation of a
translation by a induces the map

\[ \psi_r \mapsto \psi'_r = T_a^1 T_a^2\,\psi_r. \tag{10.219} \]

After some algebra we establish that

\[ \psi'_r = (r + a)\cdot(r + a)\,\epsilon - (r + a)^1\eta\gamma_0^1 J - \bar{\epsilon}, \tag{10.220} \]

as required.

Next consider a Lorentz rotation centred on the origin. These are easily ac- 
complished as they correspond to multiplying the single-particle spinor by the 
appropriate rotor. This induces the map 


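\[ \begin{aligned}
\psi_r \mapsto R^1R^2\psi_r &= r\cdot r\,R^1R^2\epsilon - R^1r^1R^2\eta\gamma_0 J - R^1R^2\bar{\epsilon} \\
&= r\cdot r\,\epsilon - (Rr\tilde{R})^1\eta\gamma_0 J - \bar{\epsilon},
\end{aligned} \tag{10.221} \]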


which achieves the desired rotation. Reflections in planes through the origin are 
equally easily achieved through the single-particle antiunitary operation 


pre lave, (10.222) 
where a is the normal vector to the plane of reflection. Applied to the two-particle
state we obtain 


\[ \psi_r \mapsto a\cdot a\left(r\cdot r\,\epsilon + (ara^{-1})^1\eta\gamma_0 J - \bar{\epsilon}\right), \tag{10.223} \]

which is the conformal representation of the reflected vector $-ara^{-1}$. As we
also have a representation of translations, we can rotate and reflect about an
arbitrary point.

Inversions in the origin are handled in conformal space by an operation that 
swaps the vectors representing the origin and infinity. In the multiparticle setting 
we must therefore interchange e€ and €, which is achieved by right-multiplication 
by IosIo3, 


Yr > Yprlo} Io? = —r-rét+rinygJI +e 
= —r-r(r" er! E> (r’) nyo J cz €), (10.224) 


where $r' = r/(r\cdot r)$. Dilations in the origin are performed in a similar manner, this
time by scaling $\epsilon$ and $\bar{\epsilon}$ by opposite amounts. This is successfully achieved
by the two-particle map induced by equation (10.199),


bp Wh = peel og + 23), (10.225) 


Special conformal transformations are also handled in the obvious way as the 
two-particle extension of the $K_a$ operator of equation (10.200). This completes
the description of the conformal group in the two-particle spacetime algebra 
setting. 

Conformal spacetime geometry can be formulated in an entirely ‘quantum’ 
language in terms of multiparticle states built from spinor representations of the 
conformal group. This link between multiparticle quantum theory and confor- 
mal geometry is quite remarkable, and is the basis for the twistor programme. 
But one obvious question remains: is this abstract quantum-mechanical formulation
necessary, if all one is interested in is the conformal geometric algebra
of spacetime? If the twistor programme is simply a highly convoluted way of 
discussing conformal geometric algebra, then the answer is no. The question is 
whether there is anything more fundamental about the quantum framework of 
the twistor approach. 

Advocates of the twistor programme would argue that the route we have followed
here, which embeds a twistor within the spacetime algebra, reverses the logic 
which initially motivates twistors. The idea is that they exist at a pre-metric 
level, so that the spacetime interval between points emerges from a particular 
two-particle quantum inner product. This hints at a route to a quantum theory of 
gravity, where distance becomes a quantum observable. But much of the initial 
promise of this work remains unfulfilled, and twistors are no longer the most 
popular candidate for a quantum theory of gravity. For classical applications 
to real spacetime geometry it does appear that all twistor methods have direct 
counterparts in the geometric algebra G(2,4), and the latter approach avoids
much of the additional formal baggage required when employing twistors. 


10.8 Notes 


The authors would like to thank Joan Lasenby for her help in writing this chapter. 
The subjects discussed in this chapter range from the foundations of algebraic 
geometry, dating back to the nineteenth century and before, through to some 
very modern applications. An excellent introduction to geometry is the book 
Geometry by Brannan, Esplen & Gray (1999). Projective geometry is described 
in the classic text by Semple & Kneebone (1998), and Lie sphere geometry is de- 
scribed by Cecil (1992). A valuable tool for studying two-dimensional geometry 
is the software package Cinderella, written by Richter-Gebert and Kortenkamp. 
This package was used to produce a number of the illustrations in this chapter. 

The geometric algebra formulation of projective geometry is described in the 
pair of important papers ‘The design of linear algebra and geometry’ by Hestenes 
and ‘Projective geometry with Clifford algebra’ by Hestenes & Ziegler (both 
1991). These papers also include preliminary discussions of conformal geometry, 
though the approach is different to that taken here. Projective geometry is 
particularly relevant to the field of computer graphics, and some applications 
of geometric algebra in this area are discussed in the papers by Stevenson & 
Lasenby (1998) and Perwass & Lasenby (1998). 

The systematic study of conformal geometry with geometric algebra was only 
initiated in the 1990s and is one of the fastest developing areas of current re- 
search. Some of the earliest developments are contained in Clifford Algebra to 
Geometric Calculus by Hestenes & Sobczyk (1984), and in the paper ‘Distance 
geometry and geometric algebra’ by Dress & Havel (1993), which emphasises 
the role of the conformal metric. Uncovering the roles of the various geometric 
primitives in conformal space was initiated by Hestenes (2001) in the paper ‘Old 
wine in new bottles: a new algebraic framework for computational geometry’ 
and is described in detail in the papers by Hestenes, Li & Rockwood (1999a,b). 
Applications to the study of surfaces are described in the paper ‘Surface evolu- 
tion and representation using geometric algebra’ by Lasenby & Lasenby (2000b), 
and a range of further applications are discussed in the proceedings of the 2001 
conference Applications of Geometric Algebra in Computer Science and Engi- 
neering (Dorst, Doran & Lasenby, 2002). The rapid development of the subject 
has meant that a consistent notation is yet to be established by all authors. 

The unification of Euclidean and non-Euclidean geometry in the conformal 
framework is also described in the series of papers by Hestenes, Li & Rockwood 
(1999a,b) and in a separate paper by Li (2001). The development in this chap- 
ter goes further than these papers in giving a concrete realisation of traditional 
methods within the geometric algebra framework. Twistor techniques are described
in volume II of Spinors and Space-time by Penrose & Rindler (1986). A
preliminary discussion of how twistors are incorporated into spacetime algebra is 
contained in the paper ‘2-spinors, twistors and supersymmetry in the spacetime 
algebra’ by Lasenby, Doran & Gull (1993b). The multiparticle description of 
conformal vectors is discussed in the paper ‘Applications of geometric algebra 
in physics and links with engineering’ by Lasenby & Lasenby (2000a). Due to a 
printing error all dot products in this paper appear as deltas, though once one 
knows this the paper is readable! 


10.9 Exercises


10.1 Let A, B, C, D denote four points on a line, and write their cross ratio
as (ABCD). Given that (ABCD) = k, prove that

\[ (BACD) = (ABDC) = 1/k \]

and

\[ (ACBD) = (DBCA) = 1 - k. \]

10.2 Prove that the cross ratio of four collinear points is a projective invariant,
regardless of the size of the space containing the line.

10.3 Given four points in a plane, no three of which are collinear, prove that
there exists a projective transformation that maps these to any second
set of four points, where again no three are collinear.

10.4 The vectors $a, b, c, a', b', c'$ all belong to G(3,0). From these we define
the bivectors

\[ A = b\wedge c, \qquad B = c\wedge a, \qquad C = a\wedge b, \]

with the same definitions holding for $A', B', C'$. Prove that

\[ \langle A\times A'\; B\times B'\; C\times C'\rangle = \langle a\wedge b\wedge c\; a'\wedge b'\wedge c'\rangle\,\langle a\wedge a'\; b\wedge b'\; c\wedge c'\rangle. \]

This proves Desargues’ theorem for two triangles in a common plane.
Does the theorem still hold in three dimensions when the triangles lie
on different planes?

10.5 Given six vectors $a_1, \ldots, a_6$ representing points in the projective plane,
prove that

\[ \frac{a_5\wedge a_4\wedge a_3\;\; a_6\wedge a_2\wedge a_1}{a_5\wedge a_1\wedge a_3\;\; a_6\wedge a_2\wedge a_4} = \frac{A_{543}A_{621}}{A_{513}A_{624}}, \]

where $A_{ijk}$ is the area of the triangle whose vertices are described
projectively by the vectors $a_i$, $a_j$, $a_k$. How does this ratio of areas transform
under a projective transformation?
10.6 A Möbius transformation in the complex plane is defined by

\[ z \mapsto z' = \frac{az+b}{cz+d}, \]

where a, b, c, d are complex numbers. Prove that, viewed as a map of
the complex plane onto itself, a Möbius transformation is a conformal
transformation. Can all conformal transformations in the plane be
represented as Möbius transformations? If not, which operation is missing?

10.7 Find the general form of the rotor, in conformal space, for a rotation
through $\theta$ in the $a\wedge b$ plane, about the point with position vector a.

10.8 A special conformal transformation in Euclidean space corresponds to a
combination of an inversion in the origin, a translation by b and a further
inversion in the origin. Prove that the result of this can be written

\[ x \mapsto x' = x\,\frac{1}{1+bx}. \]

Hence show that the linear function $f(a) = a\cdot\nabla x'$ is given by

\[ f(a) = \frac{(1+bx)\,a\,(1+xb)}{(1 + 2b\cdot x + b^2x^2)^2}. \]

Why does this transformation leave angles unchanged?

10.9 Given a conformal bivector B, with $B^2 > 0$, why does this encode a
pair of Euclidean points? Prove that the midpoint of these two points
is described by

\[ C = BnB. \]

10.10 Two circles in a Euclidean plane are described by conformal trivectors $L_1$
and $L_2$. By expressing the dual vectors $l_1$ and $l_2$ in terms of the centre
and radius of the circles, confirm directly that the circles intersect at
right angles if

\[ l_1\cdot l_2 = 0. \]

10.11 The conformal vector X denotes a point lying on the circle L, $L\wedge X = 0$,
where L is a trivector. Prove that the tangent vector T to the circle at
X can be written

\[ T = (X\cdot L)\wedge n. \]

10.12 A non-Euclidean translation along the line through X and Y is generated
by the bivector $B = Le$, where

\[ L = X\wedge Y\wedge e. \]

Prove that the hyperbolic angle $\alpha$ which takes us from X to Y is given
by

\[ \cosh(\alpha) = 1 - \frac{X\cdot Y}{X\cdot e\; Y\cdot e}. \]
10.13 The line element over the Poincaré disc is defined by

\[ ds^2 = \frac{4}{(1-r^2)^2}\left(dr^2 + r^2 d\theta^2\right), \]

where r and $\theta$ are polar coordinates and r < 1. Prove that geodesics in
this geometry all intersect the circle r = 1 at right angles.

10.14 Suppose that $\psi$ is an even element of the spacetime algebra. This is
acted on by the following linear transformations:

\[ \begin{aligned}
R_0(\psi) &= R\psi, \\
T_a(\psi) &= \psi + a\psi I\gamma_3\tfrac{1}{2}(1+\sigma_3), \\
K_a(\psi) &= \psi - a\psi I\gamma_3\tfrac{1}{2}(1-\sigma_3),
\end{aligned} \]

where R is a spacetime rotor. Prove that this set of linear transformations
generates a representation of the restricted conformal group of
spacetime.
Further topics in calculus and
group theory 


In this chapter we collect together a number of diverse algebraic ideas and tech- 
niques. The first part of the chapter deals with some advanced topics in calculus. 
We introduce the multivector derivative, which is a valuable tool in Lagrangian 
analysis. We also show how the vector derivative can be adapted to provide a 
compact notation for studying linear functions. We then extend the multivector 
derivative to the case where we differentiate with respect to a linear function. 
Finally in this part we look briefly at Grassmann calculus, which is a major 
ingredient in modern quantum field theory. 

The second major topic covered in this chapter is the theory of Lie groups. 
We provide a detailed analysis of spin groups over a real geometric algebra. By 
introducing invariant bivectors we show how both the unitary and general linear 
groups can be represented in terms of spin groups. It then follows that all Lie 
algebras can be represented as bivector algebras under the commutator product. 
Working in this way we construct the main Lie groups as subgroups of rotation 
groups. This is a valuable alternative procedure to the more common method 
of describing Lie groups in terms of matrices. Throughout this chapter we use 
the tilde symbol for the reverse, $\tilde{R}$. This avoids confusion with the Hermitian
conjugate, which is required in section 11.4 on complex structures. 


11.1 Multivector calculus 


Before extending our analysis of linear functions in geometric algebra, we first 
discuss differentiation with respect to a multivector. Suppose that the multivec- 
tor F is an arbitrary function of some multivector argument X, F = F(X). The 
derivative of F with respect to X in the A direction is defined by

\[ A * \partial_X F(X) = \lim_{\tau\to 0}\frac{F(X+\tau A) - F(X)}{\tau}, \tag{11.1} \]

where $A * B = \langle AB\rangle$. The multivector derivative $\partial_X$ is defined in terms of its
directional derivatives by

\[ \partial_X = \sum_{i<\cdots<j} e^i\wedge\cdots\wedge e^j\,(e_j\wedge\cdots\wedge e_i) * \partial_X, \tag{11.2} \]

where the $\{e_i\}$ are a set of frame vectors for the space of interest. The definition
shows how the multivector derivative $\partial_X$ inherits the multivector properties of
its argument X, as well as a calculus from equation (11.1). This is the natural
generalisation of the vector derivative $\nabla$ to a general multivector.
Most of the properties of the multivector derivative follow from the result that

\[ \partial_X\langle XA\rangle = P_X(A), \tag{11.3} \]

where $P_X(A)$ is the projection of A onto the grades contained in X. Leibniz’s
rule is then used to build up results for more complicated functions. We employ
the same rules for the multivector derivative as for the vector derivative. The
derivative acts on objects to its immediate right unless brackets are present.
If the $\partial_X$ is intended to act only on B then this is written as $\dot{\partial}_X A\dot{B}$, where
the overdot denotes the multivector on which the derivative acts. For example,
Leibniz’s rule can be written as

\[ \partial_X(AB) = \dot{\partial}_X\dot{A}B + \dot{\partial}_X A\dot{B}. \tag{11.4} \]


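As an example, suppose that $\psi$ is a general even element. The derivative of the
scalar product $\langle\psi\tilde{\psi}\rangle$ is

\[ \partial_\psi\langle\psi\tilde{\psi}\rangle = \dot{\partial}_\psi\langle\dot{\psi}\tilde{\psi}\rangle + \dot{\partial}_\psi\langle\psi\dot{\tilde{\psi}}\rangle = 2\tilde{\psi}. \tag{11.5} \]

For the second term we used the result that

\[ \dot{\partial}_\psi\langle\psi\dot{\tilde{\psi}}\rangle = \dot{\partial}_\psi\langle\dot{\psi}\tilde{\psi}\rangle = \tilde{\psi}, \tag{11.6} \]

which follows from the fact that any scalar term reverses to give itself. This result
for the derivative of $\langle\psi\tilde{\psi}\rangle$ can be verified rather more laboriously by expanding
out in a basis.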
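
The basis expansion can be sketched in the simplest non-trivial case. The choice of algebra here is an assumption made purely for illustration: take $\psi$ in the even subalgebra of the Euclidean plane, so that $\psi = x + yI$ with $I = e_1e_2$ and $I^2 = -1$. The reciprocal of $I$ under the scalar product is $\tilde{I} = -I$, so equation (11.2) has only two terms.

```latex
\begin{align*}
\psi &= x + yI, \qquad \tilde{\psi} = x - yI, \qquad
  \langle\psi\tilde{\psi}\rangle = x^2 + y^2, \\
\partial_\psi &= \partial_x + \tilde{I}\,\partial_y = \partial_x - I\,\partial_y, \\
\partial_\psi\langle\psi\tilde{\psi}\rangle &= 2x - 2yI = 2\tilde{\psi}.
\end{align*}
```

This agrees with equation (11.5); the same bookkeeping in a larger algebra simply involves more basis blades.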


11.1.1 The vector derivative and multilinear algebra 


The derivative with respect to a vector was first introduced in chapter 6 as an 
essential component of field theory. Here we exploit the properties of the vector 
derivative in a rather different setting. Suppose that a denotes an arbitrary
vector. We write the derivative with respect to a as $\partial_a$. Algebraically, this derivative
has the properties of a vector. It is essentially the same object as the vector
derivative, except that we are not differentiating with respect to the position
dependence of a function. Instead we will use $\partial_a$ to differentiate a variety of
expressions that are linear in a. Introducing the tools of calculus may appear
unnecessary for the analysis of linear algebra, but the notation does have some
practical advantages. Combinations of a and $\partial_a$ can be used to perform
contractions and protractions without having to introduce a basis frame. For example,
the results of section 4.3.2 can be summarised in the compact formulae

\[ \begin{aligned}
\partial_a\, a\cdot A_r &= rA_r, \\
\partial_a\, a\wedge A_r &= (n-r)A_r, \\
\partial_a A_r a &= (-1)^r(n-2r)A_r.
\end{aligned} \tag{11.7} \]
Similarly, the vector derivative allows the trace of a linear function to be written
simply as

\[ \mathrm{tr}(f) = \partial_a\cdot f(a). \tag{11.8} \]

The trace is the first of a series of scalar invariants that can be defined from
f. These are compactly handled using the vector derivative. Suppose that
$\{a_1, a_2, \ldots, a_n\}$ denote a set of n independent vectors. We define the
multivector variable

\[ a_{(r)} = a_1\wedge a_2\wedge\cdots\wedge a_r \tag{11.9} \]

with the associated derivative

\[ \partial_{(r)} = \frac{1}{r!}\,\partial_{a_r}\wedge\partial_{a_{r-1}}\wedge\cdots\wedge\partial_{a_1}. \tag{11.10} \]

Since

\[ \langle A_r\wedge\partial_a\; a\wedge B_r\rangle = (n-r)\langle A_rB_r\rangle, \tag{11.11} \]

it follows that

\[ \partial_{(r)}\cdot a_{(r)} = \frac{n!}{(n-r)!\,r!} = \binom{n}{r}. \tag{11.12} \]

We also make the further abbreviation

\[ f(a_{(r)}) = f(a_1)\wedge f(a_2)\wedge\cdots\wedge f(a_r) = f_{(r)}. \tag{11.13} \]

This notation allows us to write

\[ \partial_{(1)}\cdot f_{(1)} = \partial_{a_1}\cdot f(a_1) = \mathrm{tr}(f) \tag{11.14} \]

and

\[ \partial_{(n)}\cdot f_{(n)} = \partial_{(n)}\cdot a_{(n)}\det(f) = \det(f). \tag{11.15} \]

These two invariants are clearly special cases of the range of invariants $\partial_{(r)}\cdot f_{(r)}$.
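To understand the importance of the $\partial_{(r)}\cdot f_{(r)}$ invariants, consider the
characteristic polynomial for f. This is formed by constructing the determinant of the
function $G(a) = f(a) - \lambda a$, which yields

\[ \begin{aligned}
\det(G) &= \partial_{(n)}\cdot G_{(n)} \\
&= \partial_{(n)}\cdot\big((f(a_1)-\lambda a_1)\wedge(f(a_2)-\lambda a_2)\wedge\cdots\wedge(f(a_n)-\lambda a_n)\big) \\
&= \partial_{(n)}\cdot\big(f_{(n)} - n\lambda f_{(n-1)}\wedge a_n + \cdots + (-\lambda)^n a_{(n)}\big).
\end{aligned} \tag{11.16} \]

A general term in this expression goes as

\[ (-\lambda)^s\binom{n}{s}\,\partial_{(n)}\cdot\big(f_{(n-s)}\wedge a_{n-s+1}\wedge\cdots\wedge a_n\big) = (-\lambda)^s\,\partial_{(n-s)}\cdot f_{(n-s)}. \tag{11.17} \]

It follows that the characteristic polynomial is simply

\[ C(\lambda) = \sum_{s=0}^{n}(-\lambda)^{n-s}\,\partial_{(s)}\cdot f_{(s)}, \tag{11.18} \]

where $\partial_{(0)}\cdot f_{(0)} = 1$. This expression clearly demonstrates the significance of the
invariant quantities $\partial_{(r)}\cdot f_{(r)}$.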
The Cayley-Hamilton theorem states that

\[ \sum_{s=0}^{n}(-1)^{n-s}\,\partial_{(s)}\cdot f_{(s)}\; f^{n-s}(a) = 0, \tag{11.19} \]

where $f^r(a)$ denotes the r-fold application of f on a. This says that a linear
function satisfies its own characteristic equation. The theorem can be proved
quite generally without any assumptions about the form of f: it applies for
any linear function, in any linear space of any dimension and signature. An
immediate consequence is that, if e is an eigenvector of f,

\[ f(e) = \lambda e, \tag{11.20} \]

then $\lambda$ automatically satisfies the characteristic equation.
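
The role of the invariants can be checked numerically in matrix language. The sketch below assumes a Euclidean orthonormal frame, in which $\partial_{(s)}\cdot f_{(s)}$ reduces to the sum of the s-by-s principal minors of the matrix of f. The helper names (`det`, `invariant`, `cayley_hamilton_residual`) and the sample matrix `F` are illustrative choices, not notation from the text.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if not m:
        return 1  # empty matrix: the s = 0 invariant is defined to be 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def invariant(m, s):
    """Sum of the s-by-s principal minors (matrix form of the s-th invariant)."""
    n = len(m)
    return sum(det([[m[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), s))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cayley_hamilton_residual(m):
    """Evaluate sum_s (-1)^(n-s) * invariant_s * m^(n-s), as in equation (11.19)."""
    n = len(m)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]  # m^0
    for _ in range(n):
        powers.append(matmul(m, powers[-1]))
    total = [[0] * n for _ in range(n)]
    for s in range(n + 1):
        coeff = (-1) ** (n - s) * invariant(m, s)
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * powers[n - s][i][j]
    return total


F = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]  # arbitrary integer example
print([invariant(F, s) for s in range(4)])  # s = 1 is the trace, s = 3 the determinant
print(cayley_hamilton_residual(F))          # the zero matrix
```

Integer arithmetic keeps the check exact, so the residual vanishes identically rather than to rounding error.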


11.1.2 Calculus for linear functions 


As well as the ability to differentiate with respect to a multivector, it is also very 
useful to build up results for the derivative with respect to a linear function. We 
start by introducing a fixed frame $\{e_i\}$, and define the scalar coefficients

\[ f_{ij} = e_i\cdot f(e_j). \tag{11.21} \]

Now consider the derivative with respect to $f_{ij}$ of the scalar $f(b)\cdot c$. This is

\[ \partial_{f_{ij}} f(b)\cdot c = \partial_{f_{ij}}(f_{lk}b^kc^l) = c^ib^j. \tag{11.22} \]

Multiplying both sides of this equation by $a\cdot e_j\,e_i$ we obtain

\[ a\cdot e_j\,e_i\,\partial_{f_{ij}} f(b)\cdot c = a\cdot b\,c, \tag{11.23} \]

which assembles a frame-independent vector on the right-hand side. It follows
that the operator $a\cdot e_j\,e_i\,\partial_{f_{ij}}$ must also be frame-independent. We therefore
define the vector-valued differential operator $\partial_{f(a)}$ by

\[ \partial_{f(a)} = a\cdot e_j\,e_i\,\partial_{f_{ij}}. \tag{11.24} \]

The essential property of $\partial_{f(a)}$ is

\[ \partial_{f(a)} f(b)\cdot c = a\cdot b\,c, \tag{11.25} \]

which simply restates equation (11.23). As with the vector derivative, $\partial_{f(a)}$ has
the algebraic properties of a vector, which can be exploited in analysing a range
of expressions.

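Equation (11.25), together with Leibniz’s rule, is sufficient to derive the main
results for the $\partial_{f(a)}$ operator. For example, suppose that B is a bivector, and we
construct

\[ \begin{aligned}
\partial_{f(a)}\langle f(b\wedge c)B\rangle &= \dot{\partial}_{f(a)}\langle\dot{f}(b)f(c)B\rangle - \dot{\partial}_{f(a)}\langle\dot{f}(c)f(b)B\rangle \\
&= a\cdot b\,f(c)\cdot B - a\cdot c\,f(b)\cdot B \\
&= f(a\cdot(b\wedge c))\cdot B.
\end{aligned} \tag{11.26} \]

This extends by linearity to give

\[ \partial_{f(a)}\langle f(A)B\rangle = f(a\cdot A)\cdot B, \tag{11.27} \]

where A and B are both bivectors. Proceeding in this manner, we obtain the
general formula

\[ \partial_{f(a)}\langle f(A)B\rangle = \sum_r\langle f(a\cdot A_r)B_r\rangle_1. \tag{11.28} \]

For a fixed grade-r multivector $A_r$, we can now write

\[ \begin{aligned}
\partial_{f(a)} f(A_r) &= \partial_{f(a)}\langle f(A_r)X_r\rangle\,\partial_{X_r} \\
&= f(a\cdot A_r)\cdot X_r\,\partial_{X_r} \\
&= (n-r+1)\,f(a\cdot A_r).
\end{aligned} \tag{11.29} \]

This is a very powerful result. For example, suppose that for $A_r$ we take the
pseudoscalar I. We obtain

\[ \partial_{f(a)} f(I) = \partial_{f(a)}\det(f)\,I = f(a\cdot I). \tag{11.30} \]

It follows that

\[ \partial_{f(a)}\det(f) = \det(f)\,\bar{f}^{-1}(a), \tag{11.31} \]

where we have used equation (4.152). This derivation is considerably more
compact than any available to conventional matrix/tensor methods.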
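
In components, equation (11.31) says that the derivative of det(f) with respect to the coefficient $f_{ij}$ equals det(f) times the (j, i) entry of the inverse matrix. The sketch below checks this exactly with rational arithmetic, assuming an orthonormal Euclidean frame so that f is represented by an ordinary matrix. The function names and the sample matrix are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cofactor(m, i, j):
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(minor)


def inverse(m):
    """Inverse via the adjugate: inv[i][j] = cofactor(j, i) / det."""
    d = det(m)
    n = len(m)
    return [[cofactor(m, j, i) / d for j in range(n)] for i in range(n)]


def ddet(m, i, j, step=Fraction(1, 1000)):
    """Difference quotient of det with respect to entry (i, j).

    det is linear in each single entry, so this quotient is exact.
    """
    bumped = [row[:] for row in m]
    bumped[i][j] += step
    return (det(bumped) - det(m)) / step


F = [[Fraction(v) for v in row] for row in [[2, 1, 0], [0, 3, 1], [1, 0, 1]]]
F_inv = inverse(F)
checks = [ddet(F, i, j) == det(F) * F_inv[j][i] for i in range(3) for j in range(3)]
print(all(checks))
```

The transposed index on the inverse is the matrix counterpart of the adjoint $\bar{f}^{-1}$ appearing in (11.31).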
Equation (11.28) can be used to derive formulae for the functional derivative
of the adjoint. The general result is

\[ \begin{aligned}
\partial_{f(a)}\bar{f}(A_r) &= \partial_{f(a)}\langle f(X_r)A_r\rangle\,\partial_{X_r} \\
&= f(a\cdot\dot{X}_r)\cdot A_r\,\dot{\partial}_{X_r}.
\end{aligned} \tag{11.32} \]

When A is a vector, this admits the simpler form

\[ \partial_{f(a)}\bar{f}(b) = ba. \tag{11.33} \]

If f is a symmetric function then $f = \bar{f}$. But this fact cannot be exploited when
differentiating with respect to f, since $f_{ij}$ and $f_{ji}$ must be treated as independent
variables for the purposes of calculus.


11.2 Grassmann calculus 


For most of his lifetime, Grassmann’s work on algebra and geometry was largely 
ignored by the wider mathematical community. Today, however, Grassmann 
algebra is a fundamental ingredient in theoretical physics. Fermionic creation 
operators generate a Grassmann algebra, and Grassmann (anticommuting) vari- 
ables are important components of path-integral quantisation, supersymmetry 
and string theory. In this section we describe how the main algebraic results of 
Grassmann calculus can be formulated in a straightforward manner within geo- 
metric algebra. This reverses the standard approach, by which one progresses 
from Grassmann to Clifford algebra via quantisation.

Suppose that $\{\zeta_i\}$ are a set of n Grassmann variables, satisfying the
anticommutation relations

\[ \{\zeta_i, \zeta_j\} = 0. \tag{11.34} \]

The Grassmann variables $\{\zeta_i\}$ are mapped into geometric algebra by introducing
a set of n linearly independent vectors $\{e_i\}$. We do not need to specify any
properties for their inner products, though some calculations are performed more
easily if we assume that the $\{e_i\}$ belong to a Euclidean algebra. The role of the
product of Grassmann variables is taken over by the exterior product in geometric
algebra, so we write

\[ \zeta_i\zeta_j \leftrightarrow e_i\wedge e_j. \tag{11.35} \]


Equation (11.34) is satisfied by virtue of the antisymmetry of the exterior prod- 
uct. Any combination of Grassmann variables can now be replaced in the obvious 
manner by a multivector. 

In order for the above scheme to have computational power, we need a
translation for the Grassmann calculus introduced by Berezin. In this calculus,
differentiation is defined by the rules

\[ \frac{\partial\zeta_j}{\partial\zeta_i} = \delta_{ij}, \qquad \zeta_j\frac{\overleftarrow{\partial}}{\partial\zeta_i} = \delta_{ij}, \tag{11.36} \]

together with the graded Leibniz rule,

\[ \frac{\partial}{\partial\zeta_i}(f_1f_2) = \frac{\partial f_1}{\partial\zeta_i}f_2 + (-1)^{[f_1]}f_1\frac{\partial f_2}{\partial\zeta_i}, \tag{11.37} \]

where $[f_1]$ is the parity of $f_1$. The parity of a Grassmann variable is determined
by whether it contains an even or odd number of vectors. Berezin differentiation
is handled within the algebra generated by the $\{e_i\}$ frame by introducing the
reciprocal frame $\{e^i\}$, and replacing

\[ \frac{\partial}{\partial\zeta_i}( \;\leftrightarrow\; e^i\cdot( \tag{11.38} \]

so that

\[ \frac{\partial\zeta_j}{\partial\zeta_i} \leftrightarrow e^i\cdot e_j = \delta^i_j. \tag{11.39} \]

The graded Leibniz rule follows from the basic identities of geometric algebra.
For example, if $f_1$ and $f_2$ are grade-1 and so are treated as vectors in geometric
algebra, then the rule (11.37) simply restates the familiar result

\[ e^i\cdot(f_1\wedge f_2) = e^i\cdot f_1\,f_2 - f_1\,e^i\cdot f_2. \tag{11.40} \]

Right action by a Grassmann derivative operator translates in a similar manner:

\[ )\frac{\overleftarrow{\partial}}{\partial\zeta_i} \;\leftrightarrow\; )\cdot e^i. \tag{11.41} \]

The standard results for Grassmann calculus follow simply from this basic
translation scheme.

Grassmann integration is defined to be essentially the same operation as right
differentiation:

\[ \int f(\zeta)\,d\zeta_n\,d\zeta_{n-1}\cdots d\zeta_1 = f(\zeta)\frac{\overleftarrow{\partial}}{\partial\zeta_n}\frac{\overleftarrow{\partial}}{\partial\zeta_{n-1}}\cdots\frac{\overleftarrow{\partial}}{\partial\zeta_1}. \tag{11.42} \]

The equivalent operation in geometric algebra is therefore a right-sided
contraction, as given in equation (11.41). The most important formula is that for the
total integral

\[ \int f(\zeta)\,d\zeta_n\,d\zeta_{n-1}\cdots d\zeta_1 \;\leftrightarrow\; (\cdots((F\cdot e^n)\cdot e^{n-1})\cdots)\cdot e^1 = \langle FE^n\rangle, \tag{11.43} \]

where F is the multivector equivalent of $f(\zeta)$ and $E^n$ is the pseudoscalar for the
$\{e^i\}$ vectors,

\[ E^n = e^n\wedge e^{n-1}\wedge\cdots\wedge e^1. \tag{11.44} \]
Equation (11.43) does nothing more than pick out the coefficient of the pseudoscalar
part of F. 
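
A two-variable case makes this concrete. As an assumption made only for this illustration, take n = 2 with an orthonormal Euclidean frame, so that $e^i = e_i$, and write a general multivector with hypothetical coefficients $c_0, c_1, c_2, c_{12}$:

```latex
\begin{align*}
F &= c_0 + c_1 e_1 + c_2 e_2 + c_{12}\, e_1\wedge e_2, \qquad
  E^2 = e^2\wedge e^1 = e_2 e_1, \\
(F\cdot e^2)\cdot e^1 &= \big(c_{12}\,(e_1\wedge e_2)\cdot e_2\big)\cdot e_1
  = c_{12}\, e_1\cdot e_1 = c_{12}, \\
\langle F E^2\rangle &= c_{12}\,\langle e_1 e_2 e_2 e_1\rangle = c_{12}.
\end{align*}
```

The lower-grade terms drop out because two successive contractions annihilate anything of grade less than two. This mirrors the Berezin rules, under which only the top-degree term $\zeta_1\zeta_2$ has a non-zero total integral.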
A ‘change of variables’ is performed by a linear transformation f, with

\[ e_i' = f(e_i), \qquad e^{i\prime} = \bar{f}^{-1}(e^i). \tag{11.45} \]

It follows that

\[ E_n' = \det(f)\,E_n, \qquad E^{n\prime} = \det(f)^{-1}E^n, \tag{11.46} \]

so that a change of variables in a Grassmann multiple integral picks up a Jacobian
factor of $\det(f)^{-1}$. This contrasts with the factor of det (f) for a Riemannian
integral. In a similar manner all of the main results of Grassmann calculus can 
be derived in geometric algebra. Often these derivations are simpler, as access 
to the geometric product offers a quick route through the algebra. 
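
The inverse Jacobian is easiest to see for a uniform rescaling. The following sketch assumes, purely for illustration, that $f(a) = ca$ for a non-zero scalar c, and writes $E_n = e_1\wedge\cdots\wedge e_n$ for the pseudoscalar of the original frame:

```latex
\begin{align*}
e_i' &= c\,e_i, \qquad e^{i\prime} = c^{-1}e^i, \qquad \det(f) = c^n, \\
E_n' &= c^{n}E_n, \qquad E^{n\prime} = c^{-n}E^n, \\
\langle E_n' E^{n\prime}\rangle &= \langle E_n E^n\rangle .
\end{align*}
```

The top-degree element scales by $c^n$ while the integration operator scales by $c^{-n}$, so the total integral of the top-degree element is unchanged, as the Berezin normalisation requires.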


11.3 Lie groups 


In earlier chapters we saw that rotors form a continuous group, in the same way 
that rotations do. Continuous groups of this type are called Lie groups, after the 
mathematician Sophus Lie, and they play an important role in a wide range of 
subjects in physics. Lie groups contain an infinite number of elements but, like 
vector spaces, the elements can usually be written in terms of a finite number 
of parameters. For example, three-dimensional rotations can be parameterised 
in terms of the three Euler angles. The reason is that the elements of the group 
belong to a topological space — the group manifold. In two-dimensional Euclid- 
ean space all rotors correspond to phase factors, so the rotor group manifold is 
the unit circle. Every point on the circle corresponds to a distinct rotor. 

Similarly, in three dimensions rotors are built from the space of scalars and
bivectors. The only condition they have to satisfy is that $R\tilde{R} = 1$. Suppose that
we write

\[ R = x_0 + x_1Ie_1 + x_2Ie_2 + x_3Ie_3. \tag{11.47} \]

Then

\[ R\tilde{R} = x_0^2 + x_1^2 + x_2^2 + x_3^2 = 1. \tag{11.48} \]

This defines a unit vector in the four-dimensional space spanned by $\{x_0, x_i\}$. The
group manifold is therefore the set of unit vectors in four-dimensional space. This
is called a 3-sphere $S^3$; it is the four-dimensional analogue of the surface of a
ball. In higher dimensions the rotor group manifolds become increasingly
complicated.

Since all rotations are generated by the double-sided formula $Ra\tilde{R}$, both R and
−R correspond to the same rotation. The group manifold for three-dimensional
rotations, rather than for the rotors themselves, is therefore more complicated
than $S^3$. It involves taking a 3-sphere and projectively identifying opposite
points. The fact that the group manifold for rotors is somewhat simpler than
that for rotations has many applications. If the orientation of a rigid body is
described by a rotor, the configuration space for the dynamics of the rigid body
is a 3-sphere. This is important when looking for best-fit rotations, or
interpolating between two rotations to find their midpoint. The group manifold is
also the appropriate setting for a Lagrangian treatment. This has implications 
for constructing conjugate momenta, which are essential for the transition to 
a quantum theory. Applications of this include the rotational energy levels of 
molecules, many of which can be viewed as rigid bodies. 
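
The 3-sphere picture can be explored numerically. The sketch below stores a three-dimensional rotor as the 4-tuple $(x_0, x_1, x_2, x_3)$ of equation (11.47) and uses the product rule $(Ie_i)(Ie_j) = -\delta_{ij} - \epsilon_{ijk}Ie_k$. A vector v is rotated through its dual bivector Iv, which is legitimate because I commutes with everything in three dimensions. All function names are illustrative assumptions.

```python
import math


def mul(a, b):
    """Product of even elements x0 + x1*Ie1 + x2*Ie2 + x3*Ie3 stored as 4-tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    # Bivector parts multiply as -(a.b) - (a x b), from (Ie_i)(Ie_j) above.
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + b0 * a1 - (a2 * b3 - a3 * b2),
            a0 * b2 + b0 * a2 - (a3 * b1 - a1 * b3),
            a0 * b3 + b0 * a3 - (a1 * b2 - a2 * b1))


def reverse(a):
    """Reversion flips the sign of the bivector part."""
    return (a[0], -a[1], -a[2], -a[3])


def norm2(a):
    """Scalar part of a * reverse(a): x0^2 + x1^2 + x2^2 + x3^2, as in (11.48)."""
    return mul(a, reverse(a))[0]


def rotor(axis, angle):
    """R = exp(-angle/2 * I*axis) for a unit axis vector."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return (c, -s * axis[0], -s * axis[1], -s * axis[2])


def rotate(R, v):
    """Apply the double-sided rotor formula to v, via its dual bivector."""
    out = mul(mul(R, (0.0, v[0], v[1], v[2])), reverse(R))
    return out[1:]


R1 = rotor((0.0, 0.0, 1.0), math.pi / 2)   # quarter turn in the e1^e2 plane
R2 = rotor((1.0, 0.0, 0.0), 0.7)
R = mul(R1, R2)
minus_R = tuple(-x for x in R)

print(norm2(R))                       # stays on the unit 3-sphere
print(rotate(R1, (1.0, 0.0, 0.0)))    # e1 is carried to e2 (up to rounding)
print(rotate(R, (1.0, 2.0, 3.0)), rotate(minus_R, (1.0, 2.0, 3.0)))
```

The last line shows R and −R producing the same rotated vector, which is the numerical face of the identification of opposite points on the 3-sphere.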


11.3.1 Formal definitions 


The fact that the elements of a Lie group belong to a manifold is sufficient to 
provide an abstract definition of a general Lie group. A Lie group is defined
as a manifold, M, together with a product $\phi(x, y)$. Points on the manifold
can be labelled with vectors {x, y}, which can be viewed as lying in a higher
dimensional embedding space (as with the 3-sphere). The product $\phi(x, y)$ takes
as its arguments two points in the manifold, and returns a third. This encodes
the group product. The final set of conditions apply to $\phi(x, y)$ and ensure that
the product has the correct group properties. These are


(i) Closure. $\phi(x, y) \in M \quad \forall x, y \in M$.
(ii) Identity. There exists an element $e \in M$ such that $\phi(e, x) = \phi(x, e) = x$,
$\forall x \in M$.
(iii) Inverse. For every element $x \in M$ there exists a unique element $\bar{x}$ such
that $\phi(x, \bar{x}) = \phi(\bar{x}, x) = e$.
(iv) Associativity. $\phi(\phi(x, y), z) = \phi(x, \phi(y, z))$, $\forall x, y, z \in M$.
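
The four conditions can be tested on the simplest example mentioned earlier, the unit circle of two-dimensional rotors. The sketch below represents points of the manifold as unit complex numbers and takes the product to be complex multiplication; the names `phi`, `inverse` and `check_axioms` are assumptions introduced for this illustration.

```python
import cmath
import math
import random


def phi(x, y):
    """Group product on the unit circle: multiplication of unit complex numbers."""
    return x * y


def inverse(x):
    """On the unit circle the inverse is the complex conjugate."""
    return x.conjugate()


IDENTITY = 1 + 0j


def on_manifold(x, tol=1e-12):
    return abs(abs(x) - 1.0) < tol


def check_axioms(points, tol=1e-12):
    """Check closure, identity, inverse and associativity on sample points."""
    for x in points:
        assert abs(phi(IDENTITY, x) - x) < tol and abs(phi(x, IDENTITY) - x) < tol
        assert abs(phi(x, inverse(x)) - IDENTITY) < tol
        assert abs(phi(inverse(x), x) - IDENTITY) < tol
        for y in points:
            assert on_manifold(phi(x, y), tol)
            for z in points:
                assert abs(phi(phi(x, y), z) - phi(x, phi(y, z))) < tol
    return True


random.seed(0)
samples = [cmath.exp(1j * random.uniform(0.0, 2.0 * math.pi)) for _ in range(5)]
print(check_axioms(samples))
```

A finite sample cannot prove the axioms, of course; the point is only to show what each condition asserts about the product.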


Any manifold with a product defined on it with the preceding properties is 
called a Lie group manifold. Many of the group properties of the group can be 
uncovered by examining the properties near the identity element. The product 
then induces a Lie bracket structure on elements of the tangent space at the 
identity. The tangent space is a linear space and the vectors in this space, 
together with their bracket, form a Lie algebra. 


11.3.2 Spin groups and the bivector algebra 


The general theory of Lie groups is rather too abstract for our purposes. In- 
stead, we will adopt a different approach to the subject by concentrating on the 
properties of rotors, and their associated spin groups. The Lie algebra of a spin 
group is defined by a set of bivectors. We will establish that every Lie algebra 
can be represented as a bivector algebra, and that every matrix Lie group can
be represented in terms of a spin group. 

Before proceeding, we need to clarify some of the terminology for the various 
groups discussed in this chapter. We let G(p,q) denote the geometric algebra 
of a space of signature p,q, and write V for the space of grade-1 vectors. The 
orthogonal group O(p, q) is the set of all linear transformations f mapping $V \mapsto V$
that preserve the inner product. That is,


\[ \bar{f}f(a) = a \quad \forall a \in V. \tag{11.49} \]


Orthogonal transformations can have determinant 1 or —1. The special orthog- 
onal group SO(p,q) is the subgroup of O(p,q) of linear transformations with 
determinant 1. Orthogonal transformations can be constructed from series of 
reflections, each of which can be written as 


\[ a \mapsto -mam^{-1}, \tag{11.50} \]

where m is a non-null vector. Reflections have determinant −1, so do not belong
to SO(p, q). If we restrict m to be a unit vector, $m^2 = \pm 1$, then the set of all
products of unit vectors forms a group under the geometric product. This is called
the pin group, Pin(p, q). The pin group is a double-cover representation of the
orthogonal group. The elements of the pin group all satisfy

\[ M\tilde{M} = \pm 1 \quad \forall M \in \mathrm{Pin}(p, q). \tag{11.51} \]

The elements of the pin group split into those of even grade, and those of
odd grade. The even-grade elements form a subgroup called the spin group,
Spin(p, q). The spin group consists of even-grade multivectors $S \in G(p, q)$
satisfying

\[ SaS^{-1} \in V \quad \forall a \in V, \qquad S\tilde{S} = \pm 1. \tag{11.52} \]

The transformations defined by S all have determinant +1, so the spin group is
a double-cover representation of the special orthogonal group SO(p, q).
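
The determinant statements can be checked in matrix form. The sketch below assumes a Euclidean space, where $-mam^{-1} = a - 2(a\cdot m)m/m^2$, builds the matrix of a reflection from an integer vector m, and confirms that one reflection has determinant −1 while a product of two is a proper orthogonal transformation. The helper names and the two sample vectors are illustrative choices.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def reflection(m):
    """Matrix of a -> -m a m^{-1} = a - 2 (a.m) m / m^2 in a Euclidean space."""
    m2 = sum(c * c for c in m)
    n = len(m)
    return [[Fraction(int(i == j)) - Fraction(2 * m[i] * m[j], m2) for j in range(n)]
            for i in range(n)]


A = reflection([1, 2, 2])
B = reflection([0, 1, -1])
AB = matmul(A, B)                       # two reflections compose to a rotation
identity = [[int(i == j) for j in range(3)] for i in range(3)]

print(det(A), det(B), det(AB))
print(matmul(transpose(AB), AB) == identity)   # inner products are preserved
```

Rational arithmetic keeps the determinants exact, so the signs are not blurred by rounding.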

Rotors are elements of the spin group satisfying the further constraint that
$R\tilde{R} = 1$. These define the rotor group, sometimes denoted $\mathrm{Spin}^+(p, q)$. For
rotors we have $R^{-1} = \tilde{R}$, and their action on multivectors is defined by the
familiar double-sided formula


\[ M \mapsto RM\tilde{R}. \tag{11.53} \]


With the exception of rotors in G(1,1), the rotor group is a subgroup of the 
spin group consisting of elements that are connected to the identity. That is, 
all elements of the rotor group can be connected to the identity by a single 
unbroken path in the group manifold. It follows that rotors form a double-cover 
representation of the connected subgroup of SO(p, q). For Euclidean spaces the 
special orthogonal group is connected, and for these spaces there is no distinction 
between the spin group and rotor group. In mixed signature spaces the spin
group differs from the rotor group by the direct product with a discrete group. 
For example, the rotor group in spacetime is a representation of the group of 
proper orthochronous transformations (see section 5.4). 

In Euclidean spaces we know that all rotations can be written as the exponen- 
tial of a bivector. The natural question now is can any rotor be written as the 
exponential of a bivector? To answer this question, consider a family of rotors 
$R(\lambda)$, which specifies a path on the rotor group manifold. Differentiating the
normalisation condition $R\tilde{R} = 1$ we find that

\[ \frac{d}{d\lambda}(R\tilde{R}) = 0 = R'\tilde{R} + R\tilde{R}', \tag{11.54} \]

where the primes denote differentiation with respect to $\lambda$. Now define the set of
vectors

\[ a(\lambda) = R(\lambda)a_0\tilde{R}(\lambda), \tag{11.55} \]

where $a_0$ is some fixed initial vector. Differentiating this expression we find that

\[ \frac{d}{d\lambda}a(\lambda) = R'a_0\tilde{R} + Ra_0\tilde{R}' = (R'\tilde{R})a(\lambda) - a(\lambda)(R'\tilde{R}). \tag{11.56} \]

The quantity $R'\tilde{R}$ reverses to minus itself, so can only contain terms of grade
2, 6, 10 etc. But the commutator of $R'\tilde{R}$ with any vector must return another
vector, otherwise the derivative of $a(\lambda)$ would grow non-vector terms. It follows
that $R'\tilde{R}$ can only contain a bivector component. We can therefore write

\[ \frac{d}{d\lambda}R(\lambda) = -\tfrac{1}{2}B(\lambda)R(\lambda). \tag{11.57} \]

Locally, around any rotor, we can write

\[ R(\lambda + \delta\lambda) = (1 - \tfrac{1}{2}\delta\lambda\,B)R(\lambda) = \exp(-\delta\lambda\,B/2)R(\lambda). \tag{11.58} \]


In this way, bivectors capture all of the local information about the rotor group. 
All ‘nearby’ rotors differ by a term that is the exponential of a bivector. 
Now suppose we look for paths satisfying

$$ R(0) = 1, \qquad R(\lambda + \mu) = R(\lambda) R(\mu). \tag{11.59} $$

The set $R(\lambda)$ forms a one-parameter subgroup of the rotor group. For the case
of three-dimensional rotations the interpretation of this subgroup is clear: it
is the group of all rotations in a fixed plane. For this path we find that

$$ \frac{d}{d\lambda} R(\lambda + \mu) = -\tfrac{1}{2} B(\lambda + \mu) R(\lambda + \mu) = -\tfrac{1}{2} B(\lambda) R(\lambda) R(\mu). \tag{11.60} $$

It follows that $B$ is constant along this curve. We can therefore integrate
equation (11.57) to get

$$ R(\lambda) = e^{-\lambda B/2}. \tag{11.61} $$

This confirms that all rotors near the identity can be written as the exponential
of a bivector. For Euclidean space it turns out that all rotors lie on a path
described by equation (11.59) and so can be written as the exponential of a
bivector. This is not the case in mixed signature spaces, though it does turn out
that in Lorentzian spaces every rotor can be written as

$$ R = \pm e^{-B/2}. \tag{11.62} $$


It is instructive to establish the inverse result that the exponential of a bivector
always returns a rotor. To see this, return to the one-parameter family of vectors

$$ a(\lambda) = e^{-\lambda B/2}\, a_0\, e^{\lambda B/2}. \tag{11.63} $$

To establish that these are the result of rotations we need only establish that
$a(\lambda)$ is a vector, as the remaining properties follow automatically. Differentiating
with respect to $\lambda$, we find that

$$ \frac{da}{d\lambda} = e^{-\lambda B/2}\, a_0\cdot B\, e^{\lambda B/2}, \qquad \frac{d^2 a}{d\lambda^2} = e^{-\lambda B/2}\, (a_0\cdot B)\cdot B\, e^{\lambda B/2}, \quad \text{etc.} \tag{11.64} $$

For every extra derivative we pick up a further inner product with the bivector
$B$. It follows that every term in the Taylor series of $a(\lambda)$ is a vector, and the
overall operation is grade-preserving, as it must be. We have also proved the
following useful Taylor expansion:

$$ e^{-B/2}\, a\, e^{B/2} = a + a\cdot B + \frac{1}{2!}(a\cdot B)\cdot B + \cdots. \tag{11.65} $$

This series is convergent for all bivectors $B$.
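The expansion (11.65) lends itself to a quick numerical check. What follows is only a minimal sketch and is not taken from the text: it assumes a toy representation in which a multivector is a Python dictionary from basis-blade bitmaps to coefficients, and the helper names (`gp`, `lin`, `scale`, `mexp`, `size`) are illustrative choices. The sandwich product and the summed series are compared in G(1,2) for an arbitrary vector and bivector.

```python
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def lin(x, y, cy=1.0):
    """Return x + cy * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + cy * v
    return out

def scale(x, c):
    return {k: c * v for k, v in x.items()}

def mexp(x, sig, terms=40):
    """exp(x) summed as a truncated power series."""
    total, term = {0: 1.0}, {0: 1.0}
    for n in range(1, terms):
        term = scale(gp(term, x, sig), 1.0 / n)
        total = lin(total, term)
    return total

def size(x):
    return max((abs(v) for v in x.values()), default=0.0)

SIG = (1, -1, -1)        # assumed test algebra G(1,2); bits: gamma_0=1, gamma_1=2, gamma_2=4

def dot_vb(a, B):
    """Inner product of a vector with a bivector, a.B = (aB - Ba)/2."""
    return scale(lin(gp(a, B, SIG), gp(B, a, SIG), -1.0), 0.5)

a = {1: 1.0, 2: 2.0, 4: -0.5}          # arbitrary vector
B = {3: 0.7, 5: -0.3, 6: 0.4}          # arbitrary bivector

# Left-hand side of (11.65): exp(-B/2) a exp(B/2)
lhs = gp(gp(mexp(scale(B, -0.5), SIG), a, SIG), mexp(scale(B, 0.5), SIG), SIG)

# Right-hand side: a + a.B + (a.B).B / 2! + ...
rhs, term = dict(a), dict(a)
for n in range(1, 40):
    term = scale(dot_vb(term, B), 1.0 / n)
    rhs = lin(rhs, term)

err = size(lin(lhs, rhs, -1.0))
non_vector = max((abs(v) for k, v in lhs.items() if k not in (1, 2, 4)), default=0.0)
print(err, non_vector)
```

Both printed quantities should sit at the level of floating-point rounding error: the first confirms the series, the second that the result contains no non-vector terms.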


11.3.3 Examples of rotor groups 


The preceding definitions are illustrated neatly by the algebras G(1,1) and
G(1,2). First suppose that $\gamma_0$ and $\gamma_1$ are basis vectors for G(1,1), with
$\gamma_0^2 = 1$ and $\gamma_1^2 = -1$. The spin group consists of even-grade elements, which
take the form $\psi = \alpha + \beta\gamma_1\gamma_0$. The restriction that $\psi\tilde{\psi} = \pm 1$ becomes

$$ \alpha^2 - \beta^2 = \pm 1, \tag{11.66} $$

which defines four unconnected hyperbolic curves. The rotor group consists of
the subgroup for which $\alpha^2 - \beta^2 = 1$. This defines two unconnected branches
of a hyperbola, so the rotor group in G(1,1) is not connected. For the case
of Euclidean spaces the scalar product $\langle\psi\tilde{\psi}\rangle$ is positive definite, so there is no
difference between the spin and rotor groups, which are always connected.

Now suppose we add a further vector $\gamma_2$ of negative signature, and write a
general even element as

$$ R = R_0 + R_1\gamma_1\gamma_0 + R_2\gamma_2\gamma_0 + R_3\gamma_1\gamma_2. \tag{11.67} $$

The rotor group is specified by the single extra condition that $R\tilde{R} = 1$, which
becomes

$$ (R_0)^2 - (R_1)^2 - (R_2)^2 + (R_3)^2 = 1. \tag{11.68} $$

It follows that we can write

$$ R = \cosh(\alpha)\,(\cos(\theta) + \sin(\theta)\gamma_1\gamma_2) + \sinh(\alpha)\,(\cos(\phi) + \sin(\phi)\gamma_1\gamma_2)\,\gamma_1\gamma_0. \tag{11.69} $$

This parameterisation confirms that the group must now be connected. Given
an arbitrary rotor we simply find the values of the parameters $(\alpha, \theta, \phi)$, then
smoothly run them down to zero to establish a path in the group manifold that
connects the rotor to the identity. The reason we can do this in G(1,2) but
could not in G(1,1) is that the former contains a bivector generator of negative
signature. This ensures that $-1$ is connected to the identity. Among all algebras
G(p,q), with p + q > 1, the algebra G(1,1) is unique in containing no bivector
with negative square.

While the rotor group in G(1,2) is connected, it is straightforward to construct
examples of rotors that cannot be written as the exponential of a bivector. For
example, consider the rotor

$$ R = \exp((\gamma_0 + \gamma_1)\gamma_2) = 1 + (\gamma_0 + \gamma_1)\gamma_2. \tag{11.70} $$

While this rotor clearly is the exponential of a bivector, it is impossible to write
the rotor $-R$ in this way. This is why the strongest statement that can be
made about rotors in a mixed signature space is that they can be written as
$\pm\exp(-B/2)$.
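The claim about $-R$ is stated without proof, so a short sketch of one possible argument may help. It rests on a fact not spelled out in this section, namely that in the three-dimensional algebra G(1,2) the pseudoscalar commutes with every element, so every bivector $B$ squares to a scalar; the symbols $N$ and $\beta$ are introduced here purely for illustration.

```latex
% The generator in (11.70) is null, which is why the exponential series terminates:
N = (\gamma_0 + \gamma_1)\gamma_2, \qquad
N^2 = -(\gamma_0 + \gamma_1)^2\,\gamma_2^2 = 0
\quad\Longrightarrow\quad \exp(N) = 1 + N .

% Any bivector B in G(1,2) has scalar square, so its exponential takes one of three forms:
\exp(B) =
\begin{cases}
\cosh(\beta) + \sinh(\beta)\,B/\beta, & B^2 = \beta^2 > 0, \\
1 + B, & B^2 = 0, \\
\cos(\beta) + \sin(\beta)\,B/\beta, & B^2 = -\beta^2 < 0 .
\end{cases}

% A scalar part of -1 is only available in the third case, with cos(beta) = -1.
% But then sin(beta) = 0 and the bivector part vanishes, whereas
-R = -1 - N
% has the non-zero bivector part -N. Hence -R is not the exponential of any bivector.
```

The same case analysis shows why the overall sign in $\pm\exp(-B/2)$ cannot in general be absorbed into the exponent in a mixed signature space.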


11.3.4 The bivector algebra 


The operation of commuting a multivector with a bivector is always grade- 
preserving. In particular, the commutator of a bivector with a second bivector 
produces a third bivector. That is, the space of bivectors is closed under the com- 
mutator product. This closed algebra defines the Lie algebra of the associated 
rotor group. The group is formed from the algebra by the act of exponentiation. 
The commutator of two bivectors expresses the fact that rotations do not commute.
If we apply a pair of rotations, and then perform the back rotations in
the incorrect order, the result is the new rotation

$$ R a \tilde{R} = \tilde{R}_2 \tilde{R}_1 (R_2 R_1\, a\, \tilde{R}_1 \tilde{R}_2) R_1 R_2. \tag{11.71} $$

Now suppose that we are working close to the identity, so that we can write

$$ R = e^{-B/2} = e^{B_2/2}\, e^{B_1/2}\, e^{-B_2/2}\, e^{-B_1/2}. \tag{11.72} $$

Expanding the exponentials we find that

$$ B = B_1 \times B_2 + \text{higher-order terms}. \tag{11.73} $$

This is an example of a more general result known as the Baker-Campbell-Hausdorff
formula. This states that if

$$ e^{C} = e^{A} e^{B}, \tag{11.74} $$

then we have

$$ C = A + B + A\times B + \tfrac{1}{3}\left(A\times(A\times B) + B\times(B\times A)\right) + \cdots. \tag{11.75} $$

The series converges for generators of rotors sufficiently close to the identity.
(The precise definition of 'sufficiently close' was clarified by Hausdorff.)

Now suppose that we write

$$ R_1 = \exp(-\lambda B_1/2), \qquad R_2 = \exp(-\lambda B_2/2), \tag{11.76} $$

so that the combined rotor $R(\lambda) = \tilde{R}_2 \tilde{R}_1 R_2 R_1$ of equation (11.71) is a path in
the group manifold. Equation (11.73) ensures that

$$ R(\lambda) = 1 - \lambda^2 B_1 \times B_2/2 + \cdots. \tag{11.77} $$

In the tangent space at the identity the new generator is the commutator of the
two original bivectors. The bivector algebra must therefore be closed under the
commutator product. This is the way in which the local structure of a rotor
group around the identity is passed to the bivector algebra. In the abstract
theory of Lie groups, the Lie algebra elements are acted on by the Lie bracket,
which is antisymmetric and satisfies the Jacobi identity. For a rotor group the
Lie bracket is simply the commutator product for bivectors. The Jacobi identity
for the Lie algebra then reduces to the identity

$$ (A\times B)\times C + (C\times A)\times B + (B\times C)\times A = 0, \tag{11.78} $$

which holds for any three bivectors $A$, $B$ and $C$.
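Three statements of this section can be probed numerically: closure of bivectors under the commutator product, the Jacobi identity (11.78), and the leading behaviour (11.77) of the combined rotor. The sketch below is an illustration under stated assumptions rather than anything from the text: multivectors are dictionaries keyed by basis-blade bitmaps, the helper names are hypothetical, the test algebra is taken to be G(1,3), and the commutator product is assumed to be $A\times B = \frac{1}{2}(AB - BA)$.

```python
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def lin(x, y, cy=1.0):
    """Return x + cy * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + cy * v
    return out

def scale(x, c):
    return {k: c * v for k, v in x.items()}

def mexp(x, sig, terms=40):
    """exp(x) summed as a truncated power series."""
    total, term = {0: 1.0}, {0: 1.0}
    for n in range(1, terms):
        term = scale(gp(term, x, sig), 1.0 / n)
        total = lin(total, term)
    return total

def size(x):
    return max((abs(v) for v in x.values()), default=0.0)

SIG = (1, -1, -1, -1)                   # assumed test algebra G(1,3)

def cp(A, B):
    """Commutator product A x B = (AB - BA)/2."""
    return scale(lin(gp(A, B, SIG), gp(B, A, SIG), -1.0), 0.5)

# Three arbitrary bivectors (keys are bitmaps with exactly two bits set)
A = {3: 1.0, 6: 0.5}
B = {5: 0.8, 6: 0.3, 12: -0.6}
C = {3: 0.2, 9: 0.4, 10: -1.1}

# Closure: A x B should have no components outside grade 2
AxB = cp(A, B)
closure_err = max((abs(v) for k, v in AxB.items() if bin(k).count("1") != 2), default=0.0)

# Jacobi identity (11.78)
jacobi_err = size(lin(lin(cp(cp(A, B), C), cp(cp(C, A), B)), cp(cp(B, C), A)))

# Combined rotor with B_1 = A, B_2 = B, compared with the leading term of (11.77)
lam = 1e-3
R1, R2 = mexp(scale(A, -lam / 2), SIG), mexp(scale(B, -lam / 2), SIG)
R1rev, R2rev = mexp(scale(A, lam / 2), SIG), mexp(scale(B, lam / 2), SIG)
R = gp(gp(R2rev, R1rev, SIG), gp(R2, R1, SIG), SIG)
lead = size(lin(R, {0: 1.0}, -1.0))                       # size of R - 1
resid = size(lin(R, lin({0: 1.0}, AxB, -lam**2 / 2), -1.0))  # what (11.77) leaves over
print(closure_err, jacobi_err, resid / lead)
```

The ratio in the last line is expected to shrink in proportion to `lam`, which is the sense in which the commutator is the generator of the combined rotation at lowest order.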


11.3.5 Structure constants and the Killing form 


Suppose now that we introduce a basis set of bivectors $\{B_i\}$. The commutator of
any pair of these returns a third bivector, which can also be expanded in terms
of the basis set. We can therefore write

$$ B_j \times B_k = C^i_{jk} B_i. \tag{11.79} $$

The $C^i_{jk}$ are called the structure constants of the Lie algebra. They provide
one of the most compact encodings of the group properties, since knowledge of
the bracket structure is sufficient to recover most of the properties of the group.
The structure constants also provide a route to solving the problem of classifying
all possible Lie algebras over the real and complex fields. The solution of this
problem was a significant achievement, completed by the mathematician Élie
Cartan.

The adjoint representation of a Lie group is defined in terms of functions
mapping the Lie algebra onto itself. Every element of a Lie group induces an
adjoint representation through its action on the Lie algebra. For the case of rotor
groups the Lie algebra is the bivector algebra, and the adjoint representation
consists of a map of the form

$$ B \mapsto R B \tilde{R} = \mathrm{Ad}_R(B). \tag{11.80} $$

It is immediately clear that this representation satisfies

$$ \mathrm{Ad}_{R_1}(\mathrm{Ad}_{R_2}(B)) = \mathrm{Ad}_{R_1 R_2}(B). \tag{11.81} $$

The adjoint representation of the group induces an adjoint representation $\mathrm{ad}_{A/2}$
of the Lie algebra as

$$ \mathrm{ad}_{A/2}(B) = A\times B. \tag{11.82} $$

The adjoint representation of an element of the Lie algebra can be considered as
a linear map on the space of bivectors. The matrix corresponding to the adjoint
representation of the basis bivector $B_j$ is defined by the structure coefficients

$$ (\mathrm{ad}_{B_j})^i{}_k = 2 C^i_{jk}. \tag{11.83} $$

The Killing form for a Lie algebra is defined through the adjoint representation
as

$$ K(A, B) = \mathrm{tr}(\mathrm{ad}_A\, \mathrm{ad}_B). \tag{11.84} $$

Up to an irrelevant normalisation, the Killing form for a bivector algebra is
simply the inner product

$$ K(A, B) = A\cdot B, \tag{11.85} $$

which is the definition we shall adopt. It is immediately clear that rotor groups
in Euclidean space have a negative-definite Killing form. An algebra with a
negative-definite Killing form is said to be of compact type, and the associated
Lie group is compact.
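As a concrete illustration, the structure constants and Killing form can be tabulated for the smallest non-trivial case, the bivector algebra of G(3,0). The following is a hedged sketch, not the book's construction: the dictionary representation of multivectors, the helper names and the particular basis ordering $B_1 = e_2e_3$, $B_2 = e_3e_1$, $B_3 = e_1e_2$ are all assumptions made for the example.

```python
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

SIG = (1, 1, 1)                         # Euclidean G(3,0); bits: e1=1, e2=2, e3=4

# Basis bivectors as (bitmap, sign): B1 = e2e3, B2 = e3e1 = -e1e3, B3 = e1e2
BASIS = [(6, 1.0), (5, -1.0), (3, 1.0)]
Bs = [{m: s} for m, s in BASIS]

def cp(A, B):
    """Commutator product A x B = (AB - BA)/2."""
    ab, ba = gp(A, B, SIG), gp(B, A, SIG)
    return {k: 0.5 * (ab.get(k, 0.0) - ba.get(k, 0.0)) for k in set(ab) | set(ba)}

def components(X):
    """Coefficients of the bivector X on the basis B1, B2, B3."""
    return [X.get(m, 0.0) * s for m, s in BASIS]

# Structure constants stored as C[j][k][i], defined by B_j x B_k = C^i_{jk} B_i  (11.79)
C = [[components(cp(Bj, Bk)) for Bk in Bs] for Bj in Bs]

# Adjoint matrices, indexed ad[j][i][k] = 2 C^i_{jk}  (11.83)
ad = [[[2 * C[j][k][i] for k in range(3)] for i in range(3)] for j in range(3)]

def trace_of_product(P, Q):
    return sum(P[i][k] * Q[k][i] for i in range(3) for k in range(3))

# Killing form (11.84) and the inner product B_j . B_l (scalar part of the product)
killing = [[trace_of_product(ad[j], ad[l]) for l in range(3)] for j in range(3)]
inner = [[gp(Bs[j], Bs[l], SIG).get(0, 0.0) for l in range(3)] for j in range(3)]

for row_k, row_i in zip(killing, inner):
    print(row_k, row_i)
```

In this example the two matrices come out proportional, with the trace form a positive multiple of the inner product, so both are negative definite as expected for a Euclidean rotor group.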


11.4 Complex structures and unitary groups 


So far we have only dealt with the properties of real rotation groups, but it turns 
out that this is sufficient for us to uncover the properties of all Lie algebras. We 
can start to see how this works by studying how complex groups fit into our real 
geometric algebra. The ideas developed in this section are useful in a number of 
areas, particularly Hamiltonian dynamics and geometric quantum mechanics. 


11.4.1 Complex spaces 


The simplest algebraic way to define a complex structure is to introduce a commuting
scalar quantity $j$ with the property $j^2 = -1$, and to add the assumption
that all linear superpositions are now taken over the complex field. A more at- 
tractive, geometric alternative is to work in a real space of dimension 2n and 
introduce a bivector in this space to play the role of the complex structure. We 
saw in section 6.3 that complex analysis can be performed in the geometric alge- 
bra of the real two-dimensional plane with the role of the unit imaginary played 
by the unit pseudoscalar. Here we generalise this idea to an n-dimensional com- 
plex space. 

Our starting point is a real n-dimensional vector space. Suppose that this has
some arbitrary basis $\{e_k\}$, which need not be orthonormal. Now introduce a
further set of n vectors $\{f_k\}$ perpendicular to the $\{e_k\}$, with the properties

$$ f_i\cdot f_j = e_i\cdot e_j, \qquad f_i\cdot e_j = 0, \tag{11.86} $$

which hold for all $i, j = 1,\ldots,n$. From these vectors we construct the bivector

$$ J = \sum_{i=1}^{n} e_i\wedge f^i = e_i\wedge f^i, \tag{11.87} $$

where the $\{f^k\}$ are the reciprocal vectors to the $\{f_k\}$ frame. For this and the
following section we assume that repeated indices are summed from 1,...,n.
The bivector $J$ is independent of the initial choice of frame $\{e_k\}$. To see this,
introduce a second pair of frames $\{e'_k\}$ and $\{f'_k\}$ related in the same manner as
the $\{e_k\}$, $\{f_k\}$ pair. For these we find that

$$ J' = e'_i\wedge f'^i = e'_i\cdot e^j\; e_j\wedge f'^i = f'_i\cdot f^j\; e_j\wedge f'^i = e_j\wedge f^j = J. \tag{11.88} $$

In particular, if the $\{e_k\}$ frame is chosen to be orthonormal, we find that

$$ J = e_1 f_1 + e_2 f_2 + \cdots + e_n f_n = J_1 + J_2 + \cdots + J_n. \tag{11.89} $$

Each bivector blade $J_i$ then provides the complex structure for the $i$th plane.
To understand the properties of the bivector $J$ we first form the products

$$ e_i\cdot J = e_i\cdot e_j\, f^j = f_i\cdot f_j\, f^j = f_i \tag{11.90} $$

and

$$ f_i\cdot J = -e_j\, f_i\cdot f^j = -e_i. \tag{11.91} $$

It follows that

$$ (e_i\cdot J)\cdot J = f_i\cdot J = -e_i, \qquad (f_i\cdot J)\cdot J = -e_i\cdot J = -f_i, \tag{11.92} $$

and hence that

$$ (a\cdot J)\cdot J = -a, \tag{11.93} $$

for any vector $a$. We can now see how $J$ will take over the role of the unit
imaginary. For example, the analogue of phase rotations is generated by the
bivector $J$, which describes a series of coupled rotations in each of the $J_i$ planes.
A Taylor expansion then yields

$$ e^{-J\phi/2}\, a\, e^{J\phi/2} = a + \phi\, a\cdot J + \frac{\phi^2}{2!}(a\cdot J)\cdot J + \cdots = \cos(\phi)\, a + \sin(\phi)\, a\cdot J. \tag{11.94} $$

The map $a \mapsto a\cdot J$ is therefore a $\pi/2$ rotation. Setting $\phi = \pi$ we also see that

$$ a\, e^{J\pi/2} = -e^{J\pi/2}\, a, \tag{11.95} $$

so $\exp(J\pi/2)$ anticommutes with every vector in the algebra. The only multivector
with this property is the pseudoscalar, so we have

$$ e^{J\pi/2} = I_{2n}, \tag{11.96} $$

where $I_{2n}$ is the pseudoscalar of the 2n-dimensional algebra.
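The properties of $J$ derived here are easy to test in the smallest interesting case, n = 2, where the real algebra is G(4,0). The sketch below is illustrative only: the dictionary representation of multivectors, the helper names and the bit assignment of the basis vectors are assumptions of the example. Because the sign of the top-grade basis blade depends on the ordering chosen for the basis vectors, the final check is only that $\exp(J\pi/2)$ is a unit multiple of that blade.

```python
import math
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def lin(x, y, cy=1.0):
    """Return x + cy * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + cy * v
    return out

def scale(x, c):
    return {k: c * v for k, v in x.items()}

def mexp(x, sig, terms=40):
    """exp(x) summed as a truncated power series."""
    total, term = {0: 1.0}, {0: 1.0}
    for n in range(1, terms):
        term = scale(gp(term, x, sig), 1.0 / n)
        total = lin(total, term)
    return total

def size(x):
    return max((abs(v) for v in x.values()), default=0.0)

SIG = (1, 1, 1, 1)                      # G(4,0), n = 2; bits: e1=1, e2=2, f1=4, f2=8
J = {5: 1.0, 10: 1.0}                   # J = e1 f1 + e2 f2, as in (11.89)

def dotJ(a):
    """a.J = (aJ - Ja)/2 for a vector a."""
    return scale(lin(gp(a, J, SIG), gp(J, a, SIG), -1.0), 0.5)

a = {1: 0.3, 2: -1.2, 4: 0.8, 8: 2.0}   # arbitrary vector

# (11.93): (a.J).J = -a
err_square = size(lin(dotJ(dotJ(a)), a))

# (11.94): the phase rotation equals cos(phi) a + sin(phi) a.J
phi = 0.9
rot = gp(gp(mexp(scale(J, -phi / 2), SIG), a, SIG), mexp(scale(J, phi / 2), SIG), SIG)
err_rot = size(lin(rot, lin(scale(a, math.cos(phi)), dotJ(a), math.sin(phi)), -1.0))

# (11.96): exp(J pi/2) is a unit pseudoscalar
P = mexp(scale(J, math.pi / 2), SIG)
pseudo = P.get(15, 0.0)
err_pseudo = max((abs(v) for k, v in P.items() if k != 15), default=0.0)
print(err_square, err_rot, pseudo, err_pseudo)
```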

Next we need a means of distinguishing the real and imaginary parts of a
vector. As with the two-dimensional case, this requires picking out a preferred
set of directions to represent the real axes. As a matter of convention we choose
to identify these with the original $\{e_k\}$ vectors. A real vector $a$ in the
2n-dimensional algebra can now be mapped to a set of complex coefficients $\{a^i\}$ as
follows:

$$ a^i = a\cdot e^i + j\, a\cdot f^i. \tag{11.97} $$

The complex inner product therefore becomes

$$ \begin{aligned}
\langle a|b\rangle = a^i b_i^* &= (a\cdot e^i + j\, a\cdot f^i)(b\cdot e_i - j\, b\cdot f_i) \\
&= a\cdot e^i\, b\cdot e_i + a\cdot f^i\, b\cdot f_i + j\,(a\cdot f^i\, b\cdot e_i - a\cdot e^i\, b\cdot f_i) \\
&= a\cdot b + j\,(a\wedge b)\cdot J.
\end{aligned} \tag{11.98} $$

This shows that the complex inner product combines two geometrically distinct
terms. The real part is the usual vector inner product, and it follows immediately
that $a^i a_i^* = a^2$. The imaginary part is an antisymmetric product formed by
projecting the bivector $a\wedge b$ onto $J$. Antisymmetric products such as these play
an important role in symplectic geometry and Hamiltonian mechanics.




11.4.2 Unitary transformations 


We are free to consider any linear function defined over our 2n-dimensional vector
space. However, only a subset of these can be represented by complex matrices,
namely those that observe the complex structure. These transformations are linear
over the complex field, so must satisfy

$$ f(\alpha a + \beta\, a\cdot J) = \alpha f(a) + \beta f(a)\cdot J. \tag{11.99} $$

It follows that complex linear transformations satisfy

$$ f(a\cdot J) = f(a)\cdot J \tag{11.100} $$

for any vector $a$ in the 2n-dimensional vector space.

The study of complex linear functions now reduces to the study of functions
satisfying the condition (11.100). For example, the matrix operation of Hermitian
conjugation has

$$ \langle a|f(b)\rangle = \langle f^\dagger(a)|b\rangle. \tag{11.101} $$

By considering the various terms in this identity we see immediately that the
Hermitian adjoint is the same as the familiar adjoint function $\bar{f}$. That is, $f^\dagger = \bar{f}$.
This explains why it is Hermitian conjugation that is so important in analysing
complex matrices. Similarly, suppose that $a$ is a complex eigenvector of the
complex function $f$. This implies that

$$ f(a) = \alpha a + \beta\, a\cdot J. \tag{11.102} $$

Clearly, if $a$ satisfies this equation, then $a\cdot J$ satisfies

$$ f(a\cdot J) = \alpha\, a\cdot J - \beta a. \tag{11.103} $$

It follows that $a\wedge(a\cdot J)$ is an eigenbivector, with

$$ f(a\wedge(a\cdot J)) = (\alpha^2 + \beta^2)\, a\wedge(a\cdot J). \tag{11.104} $$
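The step from the eigenvector equations to the eigenbivector result is brief, so the intermediate algebra is sketched here. It assumes only that $f$ acts on a blade through its action on the vector factors, $f(a\wedge b) = f(a)\wedge f(b)$, together with equations (11.102) and (11.103).

```latex
\begin{aligned}
f\bigl(a\wedge(a\cdot J)\bigr)
  &= f(a)\wedge f(a\cdot J) \\
  &= (\alpha a + \beta\, a\cdot J)\wedge(\alpha\, a\cdot J - \beta a) \\
  &= \alpha^2\, a\wedge(a\cdot J) - \beta^2\, (a\cdot J)\wedge a
     \qquad \text{(the terms in } a\wedge a \text{ and } (a\cdot J)\wedge(a\cdot J) \text{ vanish)} \\
  &= (\alpha^2 + \beta^2)\, a\wedge(a\cdot J).
\end{aligned}
```

The eigenvalue $\alpha^2 + \beta^2$ is the squared modulus of the complex eigenvalue $\alpha + j\beta$, so the plane $a\wedge(a\cdot J)$ is scaled by a real factor while the phase information is carried by the rotation within that plane.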


Next we need to establish the invariance group of the Hermitian inner product.
This group must leave invariant both terms in equation (11.98). This includes
the inner product $a\cdot b$, which tells us that the invariance group is built from
reflections and rotations. The fact that the linear transformations preserve the
complex structure then ensures that the antisymmetric term is also invariant.
To see this, suppose that $f$ satisfies $\bar{f} = f^{-1}$, together with equation (11.100). It
follows that

$$ (f(a)\wedge f(b))\cdot J = f(a)\cdot(f(b)\cdot J) = f(a)\cdot f(b\cdot J) = (a\wedge b)\cdot J. \tag{11.105} $$

This result can be summarised concisely as

$$ f(J) = J. \tag{11.106} $$

Unitary groups are therefore constructed from reflections and rotations which
leave $J$ invariant. For a reflection to satisfy this constraint would require that
the vector generator $m$ satisfies

$$ m J m^{-1} = J. \tag{11.107} $$

But this implies that $m\cdot J = 0$, and hence that $(m\cdot J)\cdot J = -m = 0$. There are
therefore no vector generators of reflections, and hence all unitary transformations
are generated by elements of the spin group. So far we have not specified
the underlying signature, so our description applies equally to the unitary groups
U(n) and U(p,q). These groups can be represented in terms of even multivectors
in G(2n,0) and G(2p,2q) respectively.

To simplify matters, we now restrict to the Euclidean case, so we seek a rotor
description of the unitary group U(n). The spin group and rotor group in G(2n,0)
are the same, so the unitary group has a double-cover representation in terms of
rotors satisfying

$$ R J \tilde{R} = J. \tag{11.108} $$

Writing $R = \exp(-B/2)$, we see that the bivector generators of the unitary
group must satisfy

$$ B\times J = 0. \tag{11.109} $$

This defines a bivector representation of the Lie algebra u(n) of the unitary group
U(n). We can construct bivectors satisfying equation (11.109) by first using the
Jacobi identity to prove that

$$ ((a\cdot J)\wedge(b\cdot J))\times J = -(a\cdot J)\wedge b + (b\cdot J)\wedge a = -(a\wedge b)\times J. \tag{11.110} $$

It follows that

$$ (a\wedge b + (a\cdot J)\wedge(b\cdot J))\times J = 0. \tag{11.111} $$

Any bivector of the form on the left-hand side will therefore commute with $J$.
Suppose now that the $\{e_i\}$ and $\{f_i\}$ are orthonormal vectors. We can work
through all combinations of these to arrive at the bivector algebra in table 11.1.
Establishing the closure of this algebra under the commutator product is
straightforward. The bivector algebra contains $J$, which commutes with all other
elements and is responsible for a global phase term. Removing this term defines
the Lie algebra su(n) of the special unitary group SU(n). The analysis can be
repeated with a different signature base space to construct a bivector
representation of the Lie algebra u(p,q).
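The generators $E_{ij}$, $F_{ij}$ and $J_i$ of table 11.1 can be checked against condition (11.109) numerically. The following is a minimal sketch for n = 3, so that the algebra is G(6,0); the dictionary representation of multivectors and all helper names are assumptions made for illustration, not part of the text. It confirms that there are $n^2$ generators, that each commutes with $J$, and that the commutator of any two of them again commutes with $J$.

```python
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def lin(x, y, cy=1.0):
    """Return x + cy * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + cy * v
    return out

def size(x):
    return max((abs(v) for v in x.values()), default=0.0)

N = 3
SIG = (1,) * (2 * N)                              # Euclidean G(2n,0)
e = [{1 << i: 1.0} for i in range(N)]             # e_1, ..., e_n
f = [{1 << (N + i): 1.0} for i in range(N)]       # f_1, ..., f_n

def cp(A, B):
    """Commutator product A x B = (AB - BA)/2."""
    d = lin(gp(A, B, SIG), gp(B, A, SIG), -1.0)
    return {k: 0.5 * v for k, v in d.items()}

# Complex structure J = e_1 f_1 + ... + e_n f_n
J = {}
for i in range(N):
    J = lin(J, gp(e[i], f[i], SIG))

# Generators of table 11.1
gens = [gp(e[i], f[i], SIG) for i in range(N)]                                # J_i
for i in range(N):
    for j in range(i + 1, N):
        gens.append(lin(gp(e[i], e[j], SIG), gp(f[i], f[j], SIG)))            # E_ij
        gens.append(lin(gp(e[i], f[j], SIG), gp(f[i], e[j], SIG), -1.0))      # F_ij

commute_err = max(size(cp(G, J)) for G in gens)
closure_err = max(size(cp(cp(G, H), J)) for G in gens for H in gens)
print(len(gens), commute_err, closure_err)
```

The count of $n^2$ generators matches the dimension of u(n); dropping the combination $J$ itself leaves the $n^2 - 1$ generators of su(n).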


$E_{ij} = e_i e_j + f_i f_j \quad (i < j = 1,\ldots,n)$
$F_{ij} = e_i f_j - f_i e_j \quad (i < j = 1,\ldots,n)$
$J_i = e_i f_i \quad (i = 1,\ldots,n)$

Table 11.1 The Lie algebra u(n). The bivectors all belong to the geometric
algebra G(2n,0), and the vectors $\{e_i\}$ and $\{f_i\}$ form an orthonormal basis
for this algebra. The complex structure is generated by the bivector
$J = J_1 + \cdots + J_n$.


11.5 The general linear group


We have seen how to represent both rotation groups and unitary groups in terms
of spin groups. We will now see how all matrix groups can be represented by spin
groups, and hence that all possible Lie algebras can be represented as bivector
algebras. This is a significant motivation for the treatment adopted in this
chapter. Formulating general linear functions as rotors is achieved by working
in a balanced algebra, generated by equal numbers of vectors with positive and
negative square. Some of the algebraic considerations for these types of algebra
were encountered in the discussions of spacetime and conformal geometry.


11.5.1 The balanced algebra G(n,n) 


Suppose that the vectors $\{e_i\}$ span a non-degenerate space of unspecified
signature. We introduce a second frame $\{f_i\}$, orthogonal to the first and with
opposite signature, with the properties

$$ f_i\cdot f_j = -e_i\cdot e_j, \qquad e_i\cdot f_j = 0. \tag{11.112} $$

The vectors $\{e_i, f_i\}$ therefore generate the algebra G(n,n), regardless of the
signature of the original $\{e_i\}$ space. We next introduce the balanced analogue
of the complex bivector $J$ by defining

$$ K = e_i\wedge f^i. \tag{11.113} $$

This has the properties that

$$ e_i\cdot K = e_i\cdot e_j\, f^j = -f_i\cdot f_j\, f^j = -f_i \tag{11.114} $$

and

$$ f_i\cdot K = -f_i\cdot f^j\, e_j = -e_i. \tag{11.115} $$

It follows that

$$ (a\cdot K)\cdot K = K\cdot(K\cdot a) = a \qquad \forall a \in \mathcal{V}. \tag{11.116} $$

There is therefore a crucial sign difference compared with the complex bivector $J$.
This means that $K$ does not generate a complex structure, but instead generates
a null structure. To see this, we first form

$$ (a\cdot K)^2 = -((a\cdot K)\cdot K)\cdot a = -a^2, \tag{11.117} $$

so the vector $a\cdot K$ has opposite signature to $a$. Given a general vector $a$ in G(n,n)
we can define two separate null vectors by writing

$$ a = \tfrac{1}{2}(a + a\cdot K) + \tfrac{1}{2}(a - a\cdot K). \tag{11.118} $$

In this way the vector space $\mathcal{V}$ of G(n,n) splits into two null spaces, $\mathcal{V}_+$ and $\mathcal{V}_-$.
Vectors in $\mathcal{V}_+$ satisfy

$$ a_+\cdot K = a_+ \qquad \forall a_+ \in \mathcal{V}_+, \tag{11.119} $$

with a similar expression (with a minus sign) holding for $\mathcal{V}_-$. Both of the spaces
$\mathcal{V}_+$ and $\mathcal{V}_-$ are entirely null, and they are dual spaces to one another. Working
entirely with vectors in $\mathcal{V}_+$ is a further way of formulating a Grassmann algebra
within geometric algebra.


11.5.2 Linear transformations 


We will shortly demonstrate that every linear function acting on an n-dimensional
vector space, $a \mapsto f(a)$, can be represented in $\mathcal{V}_+$ by a transformation of
the form

$$ a_+ \mapsto M a_+ M^{-1}. \tag{11.120} $$

Here $M$ belongs to a subgroup of the spin group for G(n,n), and $a_+$ is the image
of $a$ in $\mathcal{V}_+$ defined by

$$ a_+ = a + a\cdot K. \tag{11.121} $$

In this sense we form a double-cover representation of the general linear group.
The relevant subgroup consists of transformations that map the subspaces $\mathcal{V}_+$
and $\mathcal{V}_-$ entirely within themselves. For this to hold we require that

$$ (M a_+ M^{-1})\cdot K = M a_+ M^{-1}, \tag{11.122} $$

so we must have

$$ \begin{aligned}
a_+ &= M^{-1}\bigl((M a_+ M^{-1})\cdot K\bigr) M \\
&= M^{-1}\,\tfrac{1}{2}\bigl(M a_+ M^{-1} K - K M a_+ M^{-1}\bigr) M \\
&= a_+\cdot(M^{-1} K M).
\end{aligned} \tag{11.123} $$

It follows that we require $M^{-1} K M = K$, or

$$ M K = K M. \tag{11.124} $$


As with the unitary case, $M$ must belong to the spin group. The bivector
generators of this group must commute with $K$. The Jacobi identity ensures
that the commutator product of two bivectors that commute with $K$ results in
a third that also commutes with $K$. We proceed as with the unitary group and
construct

$$ ((a\cdot K)\wedge(b\cdot K))\times K = a\wedge(b\cdot K) + (a\cdot K)\wedge b = (a\wedge b)\times K, \tag{11.125} $$

so that

$$ (a\wedge b - (a\cdot K)\wedge(b\cdot K))\times K = 0. \tag{11.126} $$

We can again run through all combinations of the basis bivectors to obtain the
basis for the Lie algebra of the general linear group listed in table 11.2. The
difference in structure between the Lie algebras of the linear group and the
unitary group is due solely to the different signatures of their underlying spaces.

$E_{ij} = e_i e_j - f_i f_j \quad (i < j = 1,\ldots,n)$
$F_{ij} = e_i f_j - f_i e_j \quad (i < j = 1,\ldots,n)$
$K_i = e_i f_i \quad (i = 1,\ldots,n)$

Table 11.2 The Lie algebra gl(n). The bivectors all belong to the geometric
algebra G(n,n). The $\{e_i\}$ vectors are orthonormal with positive
signature, and the $\{f_i\}$ are orthonormal with negative signature. The algebra
contains the bivector $K = K_1 + \cdots + K_n$, which generates the Abelian
subgroup of global dilations. Factoring out this bivector produces the
algebra sl(n).

The remaining step is to give an explicit construction of a representation of 
a linear transformation as an element of the spin group. The key to this is the 
singular value decomposition of section 4.4.8. This decomposition shows that any 
$n \times n$ matrix (with non-zero determinant) can be decomposed into a
positive-definite diagonal matrix sandwiched between two orthogonal matrices. To find
a suitable encoding in terms of rotors, all we have to do is find representations 
of orthogonal transformations and positive dilations. 

Rotations are clearly present as they are generated by the $E_{ij}$ bivectors in the
Lie algebra of table 11.2. These bivectors jointly rotate the $\{e_i\}$ and $\{f_i\}$ vectors
by the same amount. But the orthogonal group also includes reflections, so we
need to represent these as well. Suppose the reflection in G(p,q) is generated by
the unit vector $n$, $n^2 = 1$. We define

$$ \bar{n} = n\cdot K, \qquad \bar{n}^2 = -1, \tag{11.127} $$

and consider the multivector $n\bar{n}$. This satisfies

$$ n\bar{n}K = 2n\, \bar{n}\cdot K + nK\bar{n} = 2(n^2 + \bar{n}^2) + K n\bar{n} = K n\bar{n}, \tag{11.128} $$

so the bivector does commute with $K$. But since

$$ n\bar{n}\,(n\bar{n})^{\sim} = -1 \tag{11.129} $$

this bivector is not a rotor. It belongs to the spin group, but not the rotor group.
The action of $n\bar{n}$ on vectors $a_+ \in \mathcal{V}_+$ results in the vector

$$ -n\bar{n}\, a_+\, \bar{n}n = -n\bar{n}\, a\, \bar{n}n - (n\bar{n}\, a\, \bar{n}n)\cdot K = -nan - (nan)\cdot K, \tag{11.130} $$

where $a$ is the original vector, in the same space as $n$. Since $\bar{n}$ is in the orthogonal
space generated by the $\{f_i\}$ vectors, $\bar{n}$ anticommutes with $a$. Equation (11.130)
is the required result for a reflection. The need to include reflections forces us
to work with elements of the full spin group in G(n,n).

The final step is to see how dilations are formulated with rotors. Suppose
that we now require a positive dilation in the $n$ direction. We again form the
bivector $n\bar{n}$, which is constructed from the $F_{ij}$ and $K_i$ Lie algebra generators.
With $n_+ = n + \bar{n}$ the equivalent of the vector $n$ in $\mathcal{V}_+$, we find that

$$ e^{-\lambda n\bar{n}/2}\, n_+\, e^{\lambda n\bar{n}/2} = (\cosh(\lambda) - n\bar{n}\sinh(\lambda))(n + \bar{n}) = e^{\lambda} n_+, \tag{11.131} $$

which is a pure dilation. Furthermore, any vector perpendicular to $n$ has an
image in $\mathcal{V}_+$ that commutes with $n\bar{n}$ and so is unaffected by the action of the
rotor. These are precisely the required properties of the positive dilation, which
completes the construction.
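The dilation construction can be illustrated in G(2,2). The sketch below is not from the text and makes several assumptions explicit: multivectors are dictionaries keyed by basis-blade bitmaps, the helper names are hypothetical, and $K$ is built from equation (11.113) with reciprocal vectors $f^i = -f_i$ (appropriate when the $f_i$ square to $-1$). It checks that $n\bar{n}$ commutes with $K$, that it is not a rotor, and that the rotor $\exp(-\lambda n\bar{n}/2)$ scales $n_+$ by $e^{\lambda}$ while leaving the image of a perpendicular vector untouched.

```python
import math
from itertools import product

def gp(x, y, sig):
    """Geometric product; multivectors are dicts {basis-blade bitmap: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, t = 1.0, a >> 1
        while t:                                  # parity of swaps needed to sort factors
            if bin(t & b).count("1") % 2:
                sign = -sign
            t >>= 1
        for i, s in enumerate(sig):               # metric factor for repeated basis vectors
            if ((a & b) >> i) & 1:
                sign *= s
        out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def lin(x, y, cy=1.0):
    """Return x + cy * y."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0.0) + cy * v
    return out

def scale(x, c):
    return {k: c * v for k, v in x.items()}

def mexp(x, sig, terms=40):
    """exp(x) summed as a truncated power series."""
    total, term = {0: 1.0}, {0: 1.0}
    for n in range(1, terms):
        term = scale(gp(term, x, sig), 1.0 / n)
        total = lin(total, term)
    return total

def size(x):
    return max((abs(v) for v in x.values()), default=0.0)

SIG = (1, 1, -1, -1)                    # G(2,2); bits: e1=1, e2=2, f1=4, f2=8
K = {5: -1.0, 10: -1.0}                 # K = e_i ^ f^i with f^i = -f_i (assumed convention)

def dotK(a):
    """a.K = (aK - Ka)/2 for a vector a."""
    return scale(lin(gp(a, K, SIG), gp(K, a, SIG), -1.0), 0.5)

n = {1: 0.6, 2: 0.8}                    # unit vector in the e-space, n^2 = 1
m = {1: -0.8, 2: 0.6}                   # unit vector perpendicular to n
nbar = dotK(n)                          # (11.127)
B = gp(n, nbar, SIG)                    # the bivector n nbar
n_plus, m_plus = lin(n, nbar), lin(m, dotK(m))   # images in V_+, (11.121)

lam = 0.7
M, Minv = mexp(scale(B, -lam / 2), SIG), mexp(scale(B, lam / 2), SIG)

def sandwich(x):
    return gp(gp(M, x, SIG), Minv, SIG)

err_commute = size(lin(gp(B, K, SIG), gp(K, B, SIG), -1.0))        # (11.128)
spin_norm = gp(B, scale(B, -1.0), SIG).get(0, 0.0)                 # (11.129); reverse of B is -B
err_dilate = size(lin(sandwich(n_plus), n_plus, -math.exp(lam)))   # (11.131)
err_fixed = size(lin(sandwich(m_plus), m_plus, -1.0))
print(err_commute, spin_norm, err_dilate, err_fixed)
```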

We now have an alternative means of representing every matrix group within 
geometric algebra. Since all Lie algebras can be represented by matrices, we have 
proved that all Lie algebras can be realised as bivector algebras. The accom- 
panying Lie group elements can then all be written as even products of unit 
vectors. This is potentially a very powerful idea. One immediate construct one 
can form this way is the tensor product of two linear functions. All one requires 
for this is a separate copy of the algebra G(n, n) for each linear operator. As with 
the multiparticle spacetime algebra construction of chapter 9, the generators of 
each space are orthogonal, so anticommute. It follows that even elements from 
either space commute. So rotors from either space can be multiplied commuta- 
tively, forming a spinor representation of the tensor product. The combined rotor 
generates the correct tensor product action on vectors in the combined space. 
The tensor product can therefore be constructed from the geometric product. 


11.6 Notes 


The multivector derivative and the use of the vector derivative in analysing 
linear functions are described in detail in the book Clifford Algebra to Geometric 
Calculus by Hestenes & Sobczyk (1984). This book also contains an elegant proof 
of the Cayley-Hamilton theorem, and details of the geometric algebra approach 
to Lie group theory. Some further material is contained in the ‘Lectures in 
geometric algebra’ by Doran et al. (1996a). 
The basis of Grassmann calculus is described in The Method of Second Quan- 
tisation by Berezin (1966). A summary of the main results from this is contained 
in the appendices to the paper ‘Particle spin dynamics as the Grassmann variant 
of classical mechanics’ by Berezin and Marinov (1977). More recently, Grass- 
mann calculus has been extended to the field of superanalysis, as described in 
the books by Berezin (1987) and de Witt (1984). Similar themes also reappear in 
the subject of non-commutative geometry, as discussed by Connes & Lott (1990) 
and Coquereaux, Jadczyk & Kastler (1991). The geometric algebra treatment of 
Grassmann calculus was introduced in the papers ‘Grassmann calculus, pseudo- 
classical mechanics and geometric algebra’ by Lasenby, Doran & Gull (1993c) 
and ‘Grassmann mechanics, multivector derivatives and geometric algebra’ by 
Doran, Lasenby & Gull (1993b). Some additional material is contained in the 
thesis by Doran (1994). These works also show how the super-Lie bracket, and 
super-Lie algebras, can be formulated within geometric algebra. 

The subject of Lie groups is covered in an enormous range of textbooks. The 
series entitled Group Theory in Physics by Cornwell (1984a,1984b,1989) are par- 
ticularly recommended, as are the books by Georgi (1982) and Gilmore (1974). 
The subject of pin and spin groups has also been discussed widely. Thorough 
treatments can be found in the books An Introduction to Spinors and Geometry 
by Benn & Tucker (1988) and Clifford Algebras and Spinors by Lounesto (1997). 
The construction of the general linear group in terms of rotors was first described 
in the paper ‘Lie groups as spin groups’ by Doran et al. (1993). The thesis by 
Doran (1994) contains explicit constructions of a number of further Lie algebras, 
including symplectic and quaternionic algebras. 


11.7 Exercises 


11.1 The function $f$ maps vectors to vectors in the spacetime algebra according to

$$ f(a) = a + \alpha\, a\cdot\gamma_+\, \gamma_+, $$

where $\gamma_+$ is the null vector $\gamma_0 + \gamma_3$. Find the characteristic equation
satisfied by $f$. What are the roots of the characteristic polynomial and
how many independent eigenvectors are there? Verify that $f$ satisfies its
own characteristic equation.

11.2 Suppose that the vectors $\gamma_0$, $\gamma_1$ form an orthogonal basis for a space of
signature (1,1). Show that the linear function $f_1$,

$$ f_1(a) = -12\, a\cdot\gamma_0\, \gamma_0 + 2\, a\cdot\gamma_0\, \gamma_1 + 2\, a\cdot\gamma_1\, \gamma_0 + a\cdot\gamma_1\, \gamma_1, $$

has no symmetric square root. Similarly, show that the function $f_2$,

$$ f_2(a) = 8\, a\cdot\gamma_0\, \gamma_0 + a\cdot\gamma_0\, \gamma_1 + a\cdot\gamma_1\, \gamma_0 - a\cdot\gamma_1\, \gamma_1, $$

has two symmetric square roots, and find them both.

11.3 The function $\phi(\lambda)$ is defined by

$$ \phi(\lambda) = \det(\exp(\lambda f)), $$

where $f$ is a linear function. The exponential function is defined by the
power series

$$ \exp(\lambda f)(a) = \sum_{r=0}^{\infty} \frac{\lambda^r}{r!} f^r(a), $$

where $f^r(a)$ denotes the r-fold application of $f$ and $f^0(a) = a$. Prove that
$\phi(\lambda)$ satisfies

$$ \frac{d\phi}{d\lambda} = \partial_a\cdot f(a)\, \phi(\lambda), $$

and hence prove that

$$ \det(\exp(f)) = \exp(\partial_a\cdot f(a)). $$
11.4 Prove the following results for the functional derivative:
Oka): f" (b) = rfia), r>1, 
Oka) (F (Ar) Br) = — (F (a): Br f= (Ar))1. 

11.5 Given a non-singular function $f$ in Euclidean space, the function $\epsilon$ is
defined by

e= }ln(ff). (E11.1) 


The logarithm can be defined either by a power series, or by diagonalising 
ff and taking the logarithm of the eigenvalues. Prove that 


kca) Eb) = f7 (a), 
f(a) Op-€? (b) = f-te(a). 
11.6 Prove that left and right-sided Grassmann derivatives commute.


11.7 Suppose that $x$, $y$ and $e$ are unit vectors in G(4,0), with the pseudoscalar
denoted by $I$. Prove that the product $\phi(x, y)$, where

$$ \phi(x, y) = \langle x e y (1 + I)\rangle_1, $$

satisfies all the axioms of a Lie group product, with $e$ the identity element.
Which group does this product define?
11.8 The multivector $R$ is defined by

$$ R = -1 - (\gamma_0 + \gamma_1)\gamma_2, $$

where $\{\gamma_0, \gamma_1, \gamma_2\}$ are an orthonormal basis for G(1,2). Prove that $R$ is
a rotor, and that it is impossible to find a bivector $B$ such that
$R = \exp(-B/2)$.
11.9 The vectors $\{e_i, f_i\}$, $i = 1,\ldots,n$ form an orthonormal basis for G(2n,0).
The Lie algebra u(n) is defined by the following bivectors:

$E_{ij} = e_i e_j + f_i f_j \quad (i < j = 1,\ldots,n)$,
$F_{ij} = e_i f_j - f_i e_j \quad (i < j = 1,\ldots,n)$,
$J_i = e_i f_i$.

Prove that this algebra is closed under the commutator product. Hence
find the structure constants of the unitary group.

11.10 Prove that the Lie algebras su(4) and so(6) are isomorphic. Repeat the
analysis for the case of su(2,2) and so(2,4). This latter isomorphism is
important in the theory of twistors.
Lagrangian and Hamiltonian techniques


The Lagrangian formulation of mechanics is popular in practically all modern 
treatments of the subject. The ideas date back to the pioneering work of Euler, 
Lagrange and Hamilton, who showed how the equations of Newtonian dynamics 
could be derived from variational principles. In these, the evolution of a sys- 
tem is viewed as a path in some parameter space. The path the system follows 
is one which extremises a quantity called the action, which is the integral of 
the Lagrangian with respect to the evolution parameter (usually time). The 
mathematics behind this approach was clear from the outset, but a thorough 
physical understanding had to wait until the arrival of quantum theory. In the 
path-integral formulation of quantum mechanics a particle is viewed as simulta- 
neously following all possible paths. By assigning a phase factor to the action for 
each path and summing these, one obtains the amplitude for a quantum process. 
The classical limit can then be understood as resulting from trajectories that 
reinforce the amplitude. In this manner classical trajectories emerge as those 
which make the action stationary. 

A closely related idea is the Hamiltonian formulation of dynamics. The advan- 
tage of this approach is that it produces a set of first-order equations, making 
it well suited to numerical methods. The Hamiltonian approach also exposes 
the appropriate geometry for classical dynamical systems, which is a symplec- 
tic manifold. The Lagrangian and Hamiltonian formulations are well suited to 
studying the role of symmetry in physics. Any symmetry present in the La- 
grangian will remain present in the equations of motion, and will produce a set 
of possible paths all related by the appropriate symmetry group. In this chapter 
we will touch on many of these ideas, and provide a number of Lagrangians for 
systems of physical interest. We also show how the method can be extended to 
the case of a multivector Lagrangian, which establishes contact with the systems 
studied in pseudoclassical mechanics. 


12.1 The Euler-Lagrange equations 


Suppose that a system is described by the multivector variables $X_i$, 
$i = 1, \ldots, n$. (The use of multivector variables makes this derivation 
slightly more general than usually seen.) The Lagrangian $L$ is a scalar-valued 
function of $X_i$ and $\dot{X}_i$, and possibly time, where the dot denotes the 
derivative with respect to time. The action for the system is 

\[ S = \int_{t_1}^{t_2} dt \, L(X_i, \dot{X}_i, t), \tag{12.1} \]


and we seek the equations for a path for which the action is stationary. The 
solution to this problem is a standard application of variational calculus. We 
write 

\[ X_i(t) = X_i^0(t) + \epsilon Y_i(t), \tag{12.2} \]


where $Y_i$ is a multivector containing the same grades as $X_i$ and which 
vanishes at the endpoints, $\epsilon$ is a scalar, and $X_i^0$ represents the 
extremal path. It follows that the action must satisfy 

\[ \left. \frac{dS}{d\epsilon} \right|_{\epsilon=0} = 0 \tag{12.3} \]


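in order to ensure that $X_i^0$ is a stationary solution. The chain rule now gives 

\[ \begin{aligned}
\left. \frac{dS}{d\epsilon} \right|_{\epsilon=0}
  &= \int_{t_1}^{t_2} dt \sum_{i=1}^{n} \bigl( Y_i * \partial_{X_i} L + \dot{Y}_i * \partial_{\dot{X}_i} L \bigr) \\
  &= \int_{t_1}^{t_2} dt \sum_{i=1}^{n} Y_i * \Bigl( \partial_{X_i} L - \frac{d}{dt} \bigl( \partial_{\dot{X}_i} L \bigr) \Bigr),
\end{aligned} \tag{12.4} \]

where $A * B = \langle AB \rangle$, and the second line follows on integrating by 
parts, since the $Y_i$ vanish at the endpoints. This integral must equal zero for 
all paths $Y_i$, from which we can read off the Euler-Lagrange equations in the form 

\[ \frac{\partial L}{\partial X_i} - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{X}_i} \right) = 0 \quad \forall \, i = 1, \ldots, n. \tag{12.5} \]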


The multivector derivative ensures that there are as many equations as there are 
grades present in the $X_i$, which implies we have precisely the same number of 
equations as there are degrees of freedom in the system. 
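
As a numerical illustration of the stationarity condition (12.3), the following 
Python sketch discretises the action for a single scalar variable and perturbs a 
known solution by a variation that vanishes at the endpoints. The harmonic 
oscillator Lagrangian, the grid size and every function name here are assumptions 
chosen for the illustration; none of them come from the text. 

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretised action for L = m v^2 / 2 - k x^2 / 2 (midpoint rule)."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        v = (b - a) / dt              # velocity on this segment
        x_mid = 0.5 * (a + b)         # position at the segment midpoint
        total += dt * (0.5 * m * v * v - 0.5 * k * x_mid * x_mid)
    return total

N, T = 2000, 1.0
dt = T / N
ts = [i * dt for i in range(N + 1)]
x0 = [math.sin(t) for t in ts]                 # solves m x'' = -k x for m = k = 1
y = [math.sin(math.pi * t / T) for t in ts]    # variation, zero at both endpoints

def delta_S(eps):
    """Change in the action when the path x0 is replaced by x0 + eps * y."""
    varied = [a + eps * b for a, b in zip(x0, y)]
    return action(varied, dt) - action(x0, dt)

eps = 1e-3
# Symmetric and antisymmetric combinations isolate the first- and second-order terms.
first_order = (delta_S(eps) - delta_S(-eps)) / (2 * eps)
second_order = (delta_S(eps) + delta_S(-eps)) / (2 * eps ** 2)
print(first_order, second_order)
```

The first-order coefficient vanishes to within discretisation error, while the 
second-order coefficient does not, which is what stationarity of the action means. 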


12.1.1 Symmetries and conservation laws 


Suppose now that we consider a scalar-parameterised transformation of the dy- 
namical variables, so that we have 


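\[ X_i' = X_i'(X_i, \alpha). \tag{12.6} \]

We further assume that $\alpha = 0$ corresponds to the identity transformation 
(this restriction can be removed if necessary). The first-order change in $X_i$ 
is denoted by $\delta X_i$, where 

\[ \delta X_i = \left. \frac{\partial X_i'}{\partial \alpha} \right|_{\alpha=0}. \tag{12.7} \]

We define the new Lagrangian 

\[ L'(X_i, \dot{X}_i) = L(X_i', \dot{X}_i'), \tag{12.8} \]

which is obtained from $L$ simply by replacing each of the dynamical variables by 
their transformed equivalent. The chain rule now gives 

\[ \left. \frac{dL'}{d\alpha} \right|_{\alpha=0} = \sum_{i=1}^{n} \bigl( (\delta X_i) * \partial_{X_i} L + (\delta \dot{X}_i) * \partial_{\dot{X}_i} L \bigr). \tag{12.9} \]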


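If we now suppose that the $X_i$ satisfy the Euler-Lagrange equations, we can 
rewrite the right-hand side as a total derivative to obtain 

\[ \left. \frac{dL'}{d\alpha} \right|_{\alpha=0} = \frac{d}{dt} \sum_{i=1}^{n} \bigl( (\delta X_i) * \partial_{\dot{X}_i} L \bigr). \tag{12.10} \]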


This result applies for any transformation, and can be used in a number of ways. 
If the transformation is a symmetry of the Lagrangian, then $L'$ is independent 
of $\alpha$. In this case the left-hand side of equation (12.10) vanishes, and we 
immediately establish that the conjugate quantity 
$\sum_i (\delta X_i) * \partial_{\dot{X}_i} L$ is conserved. That is, symmetries 
of the Lagrangian produce conjugate conserved quantities. This is Noether’s 
theorem, and it is valuable for extracting 
conserved quantities from dynamical systems. The fact that the derivation of 
equation (12.10) assumed the equations of motion were satisfied means that the 
quantity is conserved ‘on-shell’. Some symmetries can also be extended ‘off-shell’, 
which becomes an important issue in quantum and supersymmetric systems. 
An important application of equation (12.10) is to the case of time translation, 


\[ X_i'(t, \alpha) = X_i(t + \alpha), \tag{12.11} \]

so that 

\[ \left. \frac{\partial X_i'}{\partial \alpha} \right|_{\alpha=0} = \dot{X}_i. \tag{12.12} \]

If there is no explicit time dependence in the Lagrangian, then equation (12.10) 
gives 

\[ \frac{dL}{dt} = \frac{d}{dt} \sum_{i=1}^{n} \dot{X}_i * \partial_{\dot{X}_i} L. \tag{12.13} \]

We therefore define the conserved Hamiltonian by 

\[ H = \sum_{i=1}^{n} \dot{X}_i * \partial_{\dot{X}_i} L - L. \tag{12.14} \]

This is more often written in terms of the generalised momenta 

\[ P_i = \partial_{\dot{X}_i} L, \tag{12.15} \]

so that 

\[ H = \sum_{i=1}^{n} \dot{X}_i * P_i - L. \tag{12.16} \]


The Hamiltonian gives the total energy in the system, and is conserved for 
systems with no explicit time dependence. 


12.1.2 Point particle actions 


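The simplest application of the Lagrangian framework is for a particle moving in 
three dimensions in an external potential $V(\boldsymbol{x})$. The Lagrangian is 
the difference between the kinetic and potential energies, 

\[ L = \frac{m \boldsymbol{v}^2}{2} - V(\boldsymbol{x}), \tag{12.17} \]

where $\boldsymbol{v} = \dot{\boldsymbol{x}}$. The Euler-Lagrange equations give 

\[ m \dot{\boldsymbol{v}} = -\boldsymbol{\nabla} V, \tag{12.18} \]

which identifies $-\boldsymbol{\nabla} V$ with the force on the particle. The 
Hamiltonian is 

\[ H = \frac{\boldsymbol{p}^2}{2m} + V, \tag{12.19} \]

where $\boldsymbol{p} = m \boldsymbol{v}$. The Hamiltonian is conserved if $V$ is 
independent of time. 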
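
The conservation statements above can be checked numerically. The sketch below 
integrates equation (12.18) for an isotropic quadratic potential with a 
velocity-Verlet scheme and monitors the Hamiltonian (12.19) together with the 
angular momentum, which is the Noether quantity conjugate to rotational symmetry. 
The choice of potential, the integrator, the step size and all names are 
assumptions made for the example. 

```python
def grad_V(x, k=1.0):
    """Gradient of the isotropic potential V = k |x|^2 / 2."""
    return [k * xi for xi in x]

def hamiltonian(x, p, m=1.0, k=1.0):
    """H = p^2 / 2m + V, as in equation (12.19)."""
    return sum(pi * pi for pi in p) / (2 * m) + 0.5 * k * sum(xi * xi for xi in x)

def angular_momentum(x, p):
    """Components of x cross p."""
    return [x[1] * p[2] - x[2] * p[1],
            x[2] * p[0] - x[0] * p[2],
            x[0] * p[1] - x[1] * p[0]]

def verlet_step(x, p, dt, m=1.0):
    """One velocity-Verlet step for m dv/dt = -grad V."""
    p_half = [pi - 0.5 * dt * g for pi, g in zip(p, grad_V(x))]
    x_new = [xi + dt * ph / m for xi, ph in zip(x, p_half)]
    p_new = [ph - 0.5 * dt * g for ph, g in zip(p_half, grad_V(x_new))]
    return x_new, p_new

x, p = [1.0, 0.0, 0.0], [0.0, 0.5, 0.2]
H0, L0 = hamiltonian(x, p), angular_momentum(x, p)
max_dH = 0.0
for _ in range(10000):
    x, p = verlet_step(x, p, 0.01)
    max_dH = max(max_dH, abs(hamiltonian(x, p) - H0))
max_dL = max(abs(a - b) for a, b in zip(angular_momentum(x, p), L0))
print(max_dH, max_dL)
```

The energy error stays bounded at the order of the squared step size, and the 
angular momentum is preserved to rounding error for this central potential. 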
The relativistic action for a free point particle raises some new issues. We 
begin with the simplest form of the action, which is 


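\[ S = -m \int dt \, (1 - \dot{\boldsymbol{x}}^2)^{1/2}, \tag{12.20} \]

where the overdot denotes the derivative with respect to time $t$, and we work in 
units with $c = 1$. The momentum is 

\[ \boldsymbol{p} = \frac{\partial L}{\partial \dot{\boldsymbol{x}}} = \frac{m \dot{\boldsymbol{x}}}{(1 - \dot{\boldsymbol{x}}^2)^{1/2}}, \tag{12.21} \]

and the equations of motion state that $\boldsymbol{p}$ is constant. The 
Hamiltonian is 

\[ H = \boldsymbol{p} \cdot \dot{\boldsymbol{x}} - L = (\boldsymbol{p}^2 + m^2)^{1/2}, \tag{12.22} \]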


and is also conserved. 
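
The Legendre transform leading to equation (12.22) can be verified at a sample 
velocity. The sketch below evaluates the Lagrangian, the momentum (12.21) and 
$\boldsymbol{p} \cdot \dot{\boldsymbol{x}} - L$, and compares the result with 
$(\boldsymbol{p}^2 + m^2)^{1/2}$. The numerical values and function names are 
illustrative assumptions. 

```python
import math

def lagrangian(v, m):
    """L = -m (1 - v^2)^(1/2) in units with c = 1."""
    return -m * math.sqrt(1.0 - sum(vi * vi for vi in v))

def momentum(v, m):
    """p = m v / (1 - v^2)^(1/2), equation (12.21)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v))
    return [m * gamma * vi for vi in v]

def hamiltonian(v, m):
    """H = p . v - L, evaluated directly from the Legendre transform."""
    p = momentum(v, m)
    return sum(pi * vi for pi, vi in zip(p, v)) - lagrangian(v, m)

def numeric_momentum(v, m, h=1e-6):
    """Central-difference estimate of dL/dv, as a check on momentum()."""
    out = []
    for k in range(3):
        vp, vm = list(v), list(v)
        vp[k] += h
        vm[k] -= h
        out.append((lagrangian(vp, m) - lagrangian(vm, m)) / (2 * h))
    return out

m, v = 2.0, [0.3, -0.4, 0.5]
p = momentum(v, m)
energy = math.sqrt(sum(pi * pi for pi in p) + m * m)
print(hamiltonian(v, m), energy)
```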
The fact that the energy and momentum are dealt with differently is unsatisfactory 
from the point of view of Lorentz invariance, so we seek an alternative 
formulation which is manifestly covariant. This can be achieved from the 
observation that the action is equivalent to 

\[ S = -m \int d\lambda \, (x' \cdot x')^{1/2}, \tag{12.23} \]

where $x' = \partial_\lambda x(\lambda)$. This integral is unchanged under a 
reparameterisation of the trajectory. By identifying $\lambda$ with $t$ we recover 
equation (12.20), and setting $\lambda$ equal to the proper time $\tau$ we see that 
the action is $-m$ times the proper time along the path. Variation with respect to 
the relativistic position $x$ now produces 

\[ \frac{d}{d\lambda} \left( \frac{m x'}{(x' \cdot x')^{1/2}} \right) = 0. \tag{12.24} \]

If we now set $\lambda$ equal to the proper time the left-hand side becomes $m$ 
times the relativistic acceleration $\dot{v}$, where overdots now denote the 
derivative with respect to proper time. 

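Interaction with an electromagnetic field is included through a term in 
$-q x' \cdot A$, producing the action 

\[ S = \int d\lambda \, \bigl( -m (x' \cdot x')^{1/2} - q x' \cdot A(x) \bigr). \tag{12.25} \]

Variation with respect to $x$ now produces 

\[ -q \nabla A(x) \cdot x' + \frac{d}{d\lambda} \left( \frac{m x'}{(x' \cdot x')^{1/2}} + q A(x) \right) = 0. \tag{12.26} \]

Setting $\lambda$ equal to the proper time, we find that 

\[ m \dot{v} = q \bigl( \nabla A(x) \cdot v - v \cdot \nabla A(x) \bigr) = q F \cdot v, \tag{12.27} \]

where $F = \nabla \wedge A$. We therefore recover the Lorentz force law, as 
discussed in section 5.5.3. 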

The square root in the free-particle action of equation (12.23) is often incon- 
venient, and can be removed by the inclusion of an einbein. This is a scalar 
function $e(\lambda)$, which has the transformation property under 
reparameterisations that 

\[ e'(\lambda') = \frac{d\lambda}{d\lambda'} \, e(\lambda), \tag{12.28} \]

where $\lambda'(\lambda)$ denotes a new parameterisation for the trajectory. The 
action can now be written in the equivalent form 

\[ S = -\frac{1}{2} \int d\lambda \, \bigl( e^{-1} x' \cdot x' + m^2 e \bigr). \tag{12.29} \]

Variation of $e$ produces 

\[ e = \frac{(x' \cdot x')^{1/2}}{m}, \tag{12.30} \]


and substitution of this back into the action recovers equation (12.23). A 
first-order form of the action can also be developed by introducing the momentum 
$p$ and writing 

\[ S = \int d\lambda \, \bigl( -p \cdot x' + \tfrac{1}{2} e (p^2 - m^2) \bigr). \tag{12.31} \]

Variation of $e$ produces the constraint equation $p^2 = m^2$, and variation of 
$p$ produces $x' = ep$. This ensures that $e$ is again given by equation (12.30). 
Finally, the $x$ variation determines 

\[ p' = \frac{d}{d\lambda} \left( \frac{m x'}{(x' \cdot x')^{1/2}} \right) = 0, \tag{12.32} \]

recovering the desired equation. In each of these cases interaction with an 
electromagnetic field is included through a term in $-q x' \cdot A$. Moving to a 
reparameterisation-invariant formulation ensures that Lorentz covariance is 
manifest, but it limits the use of Hamiltonian techniques. Hamiltonians deal with 
energy, so picking out a Hamiltonian almost always implies breaking manifest 
Lorentz covariance. 
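
The elimination of the einbein can be checked directly. The sketch below compares 
the integrand of equation (12.29), evaluated at the stationary value (12.30), with 
the square-root integrand of equation (12.23). It also confirms that rescaling 
$x'$ and $e$ by a common factor rescales the einbein integrand by the same factor, 
which is the content of the transformation law (12.28). The metric signature 
$(+,-,-,-)$, the sample tangent vector and the function names are assumptions of 
the example. 

```python
import math

def minkowski(a, b):
    """Inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def einbein_lagrangian(xp, e, m):
    """Integrand of equation (12.29): -(x'.x' / e + m^2 e) / 2."""
    return -0.5 * (minkowski(xp, xp) / e + m * m * e)

def sqrt_lagrangian(xp, m):
    """Integrand of equation (12.23): -m (x'.x')^(1/2)."""
    return -m * math.sqrt(minkowski(xp, xp))

m = 1.5
xp = [2.0, 0.3, -0.5, 0.1]                      # a timelike tangent vector x'
e_star = math.sqrt(minkowski(xp, xp)) / m       # stationary value, equation (12.30)

# Central-difference derivative of the einbein integrand with respect to e.
h = 1e-6
slope = (einbein_lagrangian(xp, e_star + h, m)
         - einbein_lagrangian(xp, e_star - h, m)) / (2 * h)

# Reparameterisation: x' -> c x' and e -> c e, with c playing the role of d(lambda)/d(lambda').
c = 0.37
scaled = einbein_lagrangian([c * xi for xi in xp], c * e_star, m)
print(einbein_lagrangian(xp, e_star, m), sqrt_lagrangian(xp, m), slope)
```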


12.1.3 Rigid-body dynamics 


As a further application, consider a rigid body as discussed in section 3.4.3. The 
configuration of the body is described by the variables $x_0(t)$ and $R(t)$, 
where $x_0$ is the position of the centre of mass, and $R$ is a three-dimensional 
rotor. We will ignore the motion of the centre of mass and concentrate on the 
rotational degrees of freedom. We also assume for simplicity that the object is 
freely rotating, so the Lagrangian is given by the rotational energy, 

\[ L = -\tfrac{1}{2} \Omega_B \cdot \mathcal{I}(\Omega_B). \tag{12.33} \]

Here $\mathcal{I}(B)$ is the inertia tensor, and 

\[ \Omega_B = -2 R^\dagger \dot{R}, \tag{12.34} \]

where the dagger denotes the reverse operation in three dimensions. 

The fact that the degrees of freedom are described by a rotor presents a slight 
problem. Rotors belong to a Lie group, and so form a group manifold. The La- 
grangian is then a function defined for paths on the group manifold, which makes 
the Euler-Lagrange equations slightly more difficult to write down. There are 
two main methods of proceeding. The first is to introduce an explicit 
parameterisation of $R$, such as the Euler angles, and to compute the Lagrangian 
in terms of these. This has the disadvantage of introducing a fixed coordinate 
system, making it difficult to assemble the final equations into a 
coordinate-free form. The structure of the rotor group provides a more elegant 
alternative. We replace the rotor $R$ by an arbitrary even element (a spinor) 
$\psi$. The constraint $\psi \psi^\dagger = 1$ is enforced through the inclusion 
of a Lagrange multiplier. This method allows us to use the coordinate-free 
apparatus of multivector calculus in the variational principle and leads quickly 
to the full set of Euler equations. 
Our Lagrangian is now 


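\[ L(\psi, \dot{\psi}) = -\tfrac{1}{2} \Omega_B \cdot \mathcal{I}(\Omega_B) - \lambda (\psi \psi^\dagger - 1), \tag{12.35} \]

where the dynamical variable is the spinor $\psi$, and $\lambda$ is a Lagrange 
multiplier. The bivector $\Omega_B$ is determined from $\psi$ by 

\[ \Omega_B = -\psi^\dagger \dot{\psi} + \dot{\psi}^\dagger \psi, \tag{12.36} \]

which is a bivector, as required. The Euler-Lagrange equations reduce to the 
single multivector equation 

\[ \partial_\psi L - \frac{d}{dt} \bigl( \partial_{\dot{\psi}} L \bigr) = 0. \tag{12.37} \]

The symmetry of the inertia tensor simplifies the derivatives, and we obtain 

\[ \begin{aligned}
\partial_\psi \bigl( -\tfrac{1}{2} \Omega_B \cdot \mathcal{I}(\Omega_B) \bigr) &= -2 \mathcal{I}(\Omega_B) \dot{\psi}^\dagger, \\
\partial_{\dot{\psi}} \bigl( -\tfrac{1}{2} \Omega_B \cdot \mathcal{I}(\Omega_B) \bigr) &= 2 \mathcal{I}(\Omega_B) \psi^\dagger,
\end{aligned} \tag{12.38} \]

where we have used the results of section 11.1. After reversing, the 
Euler-Lagrange equation for $\psi$ is simply 

\[ \frac{d}{dt} \bigl( \psi \mathcal{I}(\Omega_B) \bigr) + \dot{\psi} \mathcal{I}(\Omega_B) = \lambda \psi. \tag{12.39} \]

Variation with respect to the Lagrange multiplier $\lambda$ enforces the 
constraint that $\psi \psi^\dagger = 1$, which means we can now replace $\psi$ 
with the rotor $R$. We therefore arrive at the equation 

\[ \mathcal{I}(\dot{\Omega}_B) - \Omega_B \mathcal{I}(\Omega_B) = \lambda. \tag{12.40} \]

The scalar part of this equation determines $\lambda$ and shows that, in the 
absence of any applied couple, the rotational energy is a constant of the motion. 
The bivector part of equation (12.40) recovers the familiar equation 

\[ \mathcal{I}(\dot{\Omega}_B) - \Omega_B \times \mathcal{I}(\Omega_B) = 0, \tag{12.41} \]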


as found in section 3.4.3. The Lagrange multiplier has avoided any need for 
handling the rotor group manifold. 
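
For a numerical picture of the free rigid body, the sketch below integrates the 
Euler equations in their familiar principal-axis component form and confirms that 
the rotational energy and the magnitude of the angular momentum are constant. 
Writing the bivector equation (12.41) in components along the principal axes, 
with principal moments `I[0]`, `I[1]`, `I[2]`, is an assumption of the example 
and is not derived in the text; the Runge-Kutta integrator, the numerical values 
and the function names are likewise illustrative. 

```python
def euler_rhs(w, I):
    """Principal-axis Euler equations for a free rigid body."""
    return [(I[1] - I[2]) * w[1] * w[2] / I[0],
            (I[2] - I[0]) * w[2] * w[0] / I[1],
            (I[0] - I[1]) * w[0] * w[1] / I[2]]

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, s):
        return [wi + s * ki for wi, ki in zip(w, k)]
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, 0.5 * dt), I)
    k3 = euler_rhs(shifted(k2, 0.5 * dt), I)
    k4 = euler_rhs(shifted(k3, dt), I)
    return [wi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]

def rotational_energy(w, I):
    return 0.5 * sum(Ii * wi * wi for Ii, wi in zip(I, w))

def angular_momentum_sq(w, I):
    return sum((Ii * wi) ** 2 for Ii, wi in zip(I, w))

I = [1.0, 2.0, 3.0]          # principal moments of inertia (illustrative values)
w = [1.0, 0.1, 0.5]          # initial body-frame angular velocity
E0, L0 = rotational_energy(w, I), angular_momentum_sq(w, I)
for _ in range(5000):
    w = rk4_step(w, I, 1e-3)
E1, L1 = rotational_energy(w, I), angular_momentum_sq(w, I)
print(abs(E1 - E0) / E0, abs(L1 - L0) / L0)
```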
12.2 Classical models for spin-1/2 particles 


The use of non-relativistic spinors in describing the dynamics of a rigid body 
demonstrates that spinors are not necessarily restricted to applications in quan- 
tum mechanics. This is significant in addressing the question: what is the clas- 
sical analogue of the Dirac equation? That is, what classical dynamical system 
produces the Dirac equation on quantisation? There have been many attempts 
to answer this question, and in the following sections we investigate two of them. 


12.2.1 Rotor dynamics 


For our first classical model of a fermion, we start with the Lagrangian for the 
Dirac field. Following the notation of section 8.2 this is 


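\[ L_{\mathrm{Dirac}} = \langle \nabla \psi I \gamma_3 \tilde{\psi} - m \psi \tilde{\psi} \rangle. \tag{12.42} \]

The properties of this Lagrangian are studied in detail in chapter 13. Focusing 
on the first (kinetic) term, we can write this as 

\[ \langle \nabla \psi I \gamma_3 \tilde{\psi} \rangle = \langle \nabla \psi I \sigma_3 \psi^{-1} \psi \gamma_0 \tilde{\psi} \rangle = \langle J \nabla \psi I \sigma_3 \psi^{-1} \rangle, \tag{12.43} \]

where $J = \psi \gamma_0 \tilde{\psi}$ is the Dirac current. The streamlines of $J$ describe how the 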
probability density flows through spacetime. To reduce to a point-particle model, 
we assume that only the derivatives along a streamline are important and that 
the density is concentrated entirely on one streamline. This streamline is then 
identified with the particle worldline, and the kinetic term becomes 


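\[ \langle J \cdot \nabla \psi I \sigma_3 \psi^{-1} \rangle = \langle \psi' I \sigma_3 \psi^{-1} \rangle, \tag{12.44} \]

where the prime denotes the derivative with respect to some parameter along 
the worldline. Now recall from section 8.2 that a Dirac spinor decomposes into 

\[ \psi = \rho^{1/2} e^{I\beta/2} R, \tag{12.45} \]

where $\rho$ and $\beta$ are scalars, and $R$ is a Lorentz rotor (a member of the 
connected subgroup of the spin group). The inverse, $\psi^{-1}$, is therefore 

\[ \psi^{-1} = \rho^{-1/2} e^{-I\beta/2} \tilde{R}. \tag{12.46} \]

Substituting this parameterisation into equation (12.44), we find that 

\[ \langle \psi' I \sigma_3 \psi^{-1} \rangle = \langle R' I \sigma_3 \tilde{R} \rangle. \tag{12.47} \]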


The dynamics are now parameterised by a Lorentz rotor, as opposed to a full 
spinor. Given that the magnitude of a spinor is related to the quantum concept 
of probability density, it is sensible that the classical model should only depend 
on the rotor component. 

To complete the model we need to impose the condition that the current 
$\psi \gamma_0 \tilde{\psi}$ defines the tangent to the worldline. This is 
achieved by including a Lagrange multiplier to enforce the constraint that 

\[ x' = e R \gamma_0 \tilde{R}, \tag{12.48} \]

where $e$ is an einbein. Finally, the mass term $m \psi \tilde{\psi}$ becomes 
simply $em$, where again the einbein ensures reparameterisation invariance. The 
full Lagrangian is now 

\[ L(x, x', R, R', p, e) = \langle R' I \sigma_3 \tilde{R} - p (x' - e R \gamma_0 \tilde{R}) - em \rangle, \tag{12.49} \]

and the action is formed by integrating this with respect to the evolution 
parameter $\lambda$. The $p$ equation returns the constraint of equation (12.48), 
and the einbein $e$ returns 

\[ p \cdot (R \gamma_0 \tilde{R}) = m. \tag{12.50} \]

After variation we can choose the parameterisation such that $e = 1$, and $x'$ is 
replaced by $\dot{x}$, with dots denoting the derivative with respect to proper 
time along the worldline $x(\tau)$. It follows that $p \cdot \dot{x} = m$. 
Clearly, then, we can identify $p$ with the momentum. The $x$ variation then says 
that the momentum is constant. 

The final equation requires varying R, which lies on the group manifold of the 
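rotor group $\mathrm{Spin}^+(1,3)$. This variation can be performed in a number 
of ways. We could extend the technique employed for rigid-body mechanics, and 
relax the normalisation constraint so that $R$ becomes a full spinor. The 
normalisation is then enforced by a pair of Lagrange multipliers (one each for 
the scalar and pseudoscalar terms). However, we can avoid this by returning to 
the original form of the Lagrangian in terms of $\psi$ and replacing the relevant 
terms by 

\[ \langle R' I \sigma_3 \tilde{R} + e p R \gamma_0 \tilde{R} \rangle = \langle \psi' I \sigma_3 \psi^{-1} + e p \psi \gamma_0 \tilde{\psi} / \rho \rangle, \tag{12.51} \]

where $\rho = |\psi \tilde{\psi}|$. This form ensures that $L$ is only dependent 
on the rotor component of $\psi$, but still allows us to vary $L$ with respect to 
$\psi$. This is easier than constructing the derivative on the group manifold. To 
proceed we need a pair of additional results. The first is that 

\[ \partial_\psi \langle M \psi^{-1} \rangle = -\psi^{-1} M \psi^{-1}, \tag{12.52} \]

which holds for any even multivector $M$. The second is that 

\[ 2 \rho \, \partial_\psi \rho = \partial_\psi \langle \psi \gamma_0 \tilde{\psi} \psi \gamma_0 \tilde{\psi} \rangle = 4 \gamma_0 \tilde{\psi} \psi \gamma_0 \tilde{\psi}, \tag{12.53} \]

which implies that 

\[ \partial_\psi \rho = 2 \rho \psi^{-1}. \tag{12.54} \]

The $\psi$ variation now produces (after setting $\lambda$ equal to proper time 
$\tau$, so that $e = 1$, and evaluating at $\rho = 1$) 

\[ -\psi^{-1} \dot{\psi} I \sigma_3 \psi^{-1} + 2 \gamma_0 \tilde{\psi} p - 2 \psi^{-1} \langle p \psi \gamma_0 \tilde{\psi} \rangle - \frac{d}{d\tau} \bigl( I \sigma_3 \psi^{-1} \bigr) = 0. \tag{12.55} \]

On multiplying through by $\psi$ we obtain 

\[ \dot{S} + 2 p \wedge \dot{x} = 0, \tag{12.56} \]

where 

\[ S = \psi I \sigma_3 \psi^{-1} = R I \sigma_3 \tilde{R}. \tag{12.57} \]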


The rotor variation therefore produces an equation which states that the total 
angular momentum is conserved. This shows that the classical model has many 
of the desired features. Linear momentum is conserved, and the spin-1/2 nature 
of the particle is captured in the total angular momentum. 

The simplest solution to the equations of motion has $m \dot{x} = p$, so that the 
particle is at rest in the $p$ frame. The spin bivector is also constant, as one 
would expect in the absence of interaction. There is a range of further 
solutions, however, which are of interest. Suppose that we align $\gamma_0$ with 
the momentum, and write 

\[ p = \frac{m}{\cosh(\alpha)} \gamma_0 = m^* \gamma_0, \tag{12.58} \]

which defines the ‘effective mass’ $m^*$. The equations of motion are then solved 
by 

\[ R = e^{-I\sigma_3 m^* \tau} e^{\alpha \sigma_2 / 2}, \qquad x = \tau \cosh(\alpha) \gamma_0 + \frac{\sinh(\alpha)}{2 m^*} e^{-2 I \sigma_3 m^* \tau} \gamma_1. \tag{12.59} \]

The total angular momentum is 

\[ \tfrac{1}{2} S + p \wedge x = \tfrac{1}{2} \cosh(\alpha) I \sigma_3, \tag{12.60} \]

which is constant. This solution describes a particle rotating at angular 
frequency $2m / \cosh(\alpha)$ (as measured by the proper time), and with a 
radius of 

\[ r_0 = \frac{1}{2m} \sinh(\alpha) \cosh(\alpha). \tag{12.61} \]


As $\alpha$ increases, the momentum goes ‘off-shell’, and the particle can ‘borrow’ 
energy to execute a circular motion and feel out its surroundings. This model 
therefore captures some aspects of fermionic quantum mechanics, exhibiting a 
form of zitterbewegung, while still describing a point-particle trajectory. 

For many applications the model constructed here is unnecessarily compli- 
cated, and we instead choose to work with the somewhat simpler Lagrangian 


\[ L(x, x', \psi, \psi', p, e) = \langle \psi' I \sigma_3 \tilde{\psi} - p (x' - e \psi \gamma_0 \tilde{\psi}) - e m \psi \tilde{\psi} \rangle. \tag{12.62} \]

Global phase invariance of $L$ ensures that $\langle \psi \tilde{\psi} \rangle$ is 
constant and can be set to 1. If the initial conditions are chosen suitably, one 
can also show that the 4-vector part of $\psi \tilde{\psi}$ remains zero, and the 
motion reduces to that of the previous model. An open question is whether either 
of these models produces the Dirac equation on quantisation. The problem is that 
a path-integral quantisation involves the group manifold of 
$\mathrm{Spin}^+(1,3)$, which is non-compact. In addition, the Lagrangian is 
first order, which can give rise to complications in the path integral. 

A deficiency of the classical model is exposed when we couple the particle to 
the electromagnetic field. If we consider the phase transformation 


\[ R \mapsto R e^{I \sigma_3 \phi}, \tag{12.63} \]

then this introduces a term going as $-\partial_\lambda \phi = -x' \cdot (\nabla \phi)$ 
into the Lagrangian. Local phase invariance is therefore restored by modifying 
the Lagrangian to 

\[ L(x, x', R, R', p, e) = \langle R' I \sigma_3 \tilde{R} - p (x' - e R \gamma_0 \tilde{R}) - q x' \cdot A - em \rangle, \tag{12.64} \]

where $A$ is the electromagnetic vector potential. The $q x' \cdot A$ term is the 
natural point-particle equivalent of the interaction term $q J \cdot A$ in the 
Dirac Lagrangian. Variation now modifies the $p$ equation in the expected manner 
to read 

\[ \dot{p} = q F \cdot \dot{x}. \tag{12.65} \]

But the spin equation is not affected: we do not naturally pick up the $g = 2$ 
behaviour for the gyromagnetic ratio of a spin-1/2 particle. This is 
disappointing, given that the $A$ term is all that is required to guarantee that 
$g = 2$ in Dirac theory. The problem can be rectified by introducing a further 
term into the Lagrangian, going as 

\[ \left\langle -\frac{eq}{2m} F R I \sigma_3 \tilde{R} \right\rangle = \left\langle -\frac{eq}{2m} F \psi I \sigma_3 \psi^{-1} \right\rangle. \tag{12.66} \]

This modifies both the $R$ and $p$ equations to give 

\[ \begin{aligned}
\dot{S} &= 2 \dot{x} \wedge p + \frac{q}{m} F \times S, \\
\dot{p} &= q F \cdot \dot{x} + \frac{q}{2m} \nabla F(x) \cdot S.
\end{aligned} \tag{12.67} \]

These equations have the expected form for a particle with $g = 2$, but the value 
of the gyromagnetic ratio has been put in by hand. 


12.2.2 Pseudoclassical mechanics 


A quite different approach to the classical mechanics of a spin-1/2 particle is 
provided by pseudoclassical mechanics, which introduces the interesting new 
concept of a multivector-valued Lagrangian. We only consider the simplest case 
of a non-relativistic model. The model is motivated by the idea that the spin 
operators satisfy 

\[ \hat{s}_i \hat{s}_j + \hat{s}_j \hat{s}_i = \tfrac{1}{2} \hbar^2 \delta_{ij}. \tag{12.68} \]

The classical analogue of these relations should have zero on the right-hand side, 
so the particle is described by a set of anticommuting Grassmann variables. 
This argument runs contrary to the viewpoint of this book, which is that there 
is nothing at all quantum-mechanical about a Clifford algebra, but the model 
itself is interesting. We introduce a set of three Grassmann variables 
$\{\zeta_i\}$ and define the Lagrangian 

\[ L = \tfrac{1}{2} \zeta_i \dot{\zeta}_i - \tfrac{1}{2} \epsilon_{ijk} \omega_i \zeta_j \zeta_k, \tag{12.69} \]

where the $\omega_i$ are constants. Following the prescription of section 11.2 we 
replace the Grassmann variables with a set of three vectors $\{e_i\}$ under the 
exterior product. The Lagrangian then becomes 

\[ L = \tfrac{1}{2} e_i \wedge \dot{e}_i - \omega, \tag{12.70} \]

where 

\[ \omega = \tfrac{1}{2} \epsilon_{ijk} \omega_i e_j e_k = \omega_1 (e_2 \wedge e_3) + \omega_2 (e_3 \wedge e_1) + \omega_3 (e_1 \wedge e_2). \tag{12.71} \]


The Lagrangian is now a bivector, and not simply a scalar. This raises an imme- 
diate question — how can the variational principle be applied to a multivector? 
The answer is that all components of the Lagrangian must remain stationary 
under variation. Suppose that we contract L with an arbitrary bivector B to 
form the scalar (LB). Variation of this produces the Euler-Lagrange equation 


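\[ \partial_{e_i} \langle LB \rangle - \frac{d}{dt} \bigl( \partial_{\dot{e}_i} \langle LB \rangle \bigr) = 0. \tag{12.72} \]

Treating the $\{e_i\}$ as vector variables, we arrive at the equation 

\[ (\dot{e}_i + \epsilon_{ijk} \omega_j e_k) \cdot B = 0. \tag{12.73} \]

But we must demand that this vanishes for all possible $B$, from which we extract 
the equation 

\[ \dot{e}_i + \epsilon_{ijk} \omega_j e_k = 0. \tag{12.74} \]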


This is the general method for handling multivector Lagrangians. The contrac- 
tion with any constant multivector must result in a scalar Lagrangian which 
is stationary when the equations of motion are satisfied. Equation (12.73) il- 
lustrates a further feature. For a fixed B, equation (12.73) is not sufficient to 
extract the full set of equations. It is only by allowing $B$ to vary, and hence 
treating the Lagrangian as a bivector, that the full equations are extracted. 

To solve equation (12.74) we first establish that $\omega$ is constant, 

\[ \dot{\omega} = 0, \tag{12.75} \]

which follows immediately from the equation of motion. Next we introduce the 
reciprocal frame $\{e^i\}$ and write the equation of motion in the form 

\[ \dot{e}_i = e^i \cdot \omega. \tag{12.76} \]

Now suppose that we define the symmetric function $g$ by 

\[ g(a) = \sum_{i=1}^{3} a \cdot e_i \, e_i, \tag{12.77} \]

so that $g(e^i) = e_i$. The function $g$ is a form of metric for the 
non-orthonormal frame $e_i$. On differentiating $g(a)$, holding $a$ constant, we 
find that 

\[ \frac{d}{dt} g(a) = \sum_{i=1}^{3} \bigl( a \cdot (e^i \cdot \omega) \, e_i + a \cdot e_i \, e^i \cdot \omega \bigr) = \omega \cdot a + a \cdot \omega = 0. \tag{12.78} \]

It follows that the function $g$ is constant, even though the $e_i$ vectors vary 
in time. The motion is found by introducing the square root of $g$, which 
satisfies 

\[ h(h(a)) = g(a), \qquad \bar{h} = h. \tag{12.79} \]

This function is found by diagonalising $g$ and taking the square root of the 
eigenvalues. It follows that 

\[ \delta_i^j = e_i \cdot e^j = g(e^i) \cdot e^j = h(e^i) \cdot h(e^j). \tag{12.80} \]

The vectors $h(e^i)$ are therefore orthonormal, so we write 

\[ f_i = h(e^i), \qquad f_i \cdot f_j = \delta_{ij}. \tag{12.81} \]

These vectors satisfy 

\[ \dot{f}_i = f_i \cdot \Omega, \tag{12.82} \]

where 

\[ \Omega = \omega_1 f_2 f_3 + \omega_2 f_3 f_1 + \omega_3 f_1 f_2. \tag{12.83} \]

Since $h(\Omega) = \omega$, we see that $\Omega$ is a constant bivector. It follows 
that the $\{f_i\}$ frame simply rotates at a constant frequency in the $\Omega$ 
plane. The solution for the $e^i$ vectors is therefore 

\[ e^i(t) = h^{-1} \bigl( e^{-\Omega t/2} f_i(0) e^{\Omega t/2} \bigr). \tag{12.84} \]


The only motion taking place in this system is that a fixed set of orthonormal 
vectors is rotating in a constant plane, and the resulting frame is then distorted 
by a constant symmetric function. A simple picture of this type is fairly typical 
of pseudoclassical systems when analysed in this manner. 
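
The two conservation statements used in this solution, that $\omega$ and the 
metric function $g$ are constant, can be checked by integrating equation (12.74) 
numerically for a non-orthonormal initial frame. In the sketch below the frame 
vectors are stored as Cartesian triples, the bivector $\omega$ is represented by 
its dual vector (built from cross products), and $g$ is represented by the 
symmetric matrix $\sum_i e_i e_i^T$. These representations, the Runge-Kutta 
integrator, the numerical values and the function names are all assumptions of 
the example. 

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rhs(state, w):
    """de_i/dt = -eps_ijk w_j e_k, with state = e_1, e_2, e_3 flattened."""
    e1, e2, e3 = state[0:3], state[3:6], state[6:9]
    d1 = [w[2] * b - w[1] * c for b, c in zip(e2, e3)]
    d2 = [w[0] * c - w[2] * a for a, c in zip(e1, e3)]
    d3 = [w[1] * a - w[0] * b for a, b in zip(e1, e2)]
    return d1 + d2 + d3

def rk4_step(state, w, dt):
    def shifted(k, s):
        return [x + s * ki for x, ki in zip(state, k)]
    k1 = rhs(state, w)
    k2 = rhs(shifted(k1, 0.5 * dt), w)
    k3 = rhs(shifted(k2, 0.5 * dt), w)
    k4 = rhs(shifted(k3, dt), w)
    return [x + dt * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]

def metric(state):
    """Matrix of the function g(a) = sum_i (a . e_i) e_i."""
    frame = [state[0:3], state[3:6], state[6:9]]
    return [[sum(e[r] * e[c] for e in frame) for c in range(3)] for r in range(3)]

def omega_dual(state, w):
    """Dual vector of omega = w1 e2^e3 + w2 e3^e1 + w3 e1^e2."""
    e1, e2, e3 = state[0:3], state[3:6], state[6:9]
    terms = [cross(e2, e3), cross(e3, e1), cross(e1, e2)]
    return [sum(w[i] * terms[i][k] for i in range(3)) for k in range(3)]

w = [0.3, -0.7, 1.1]                                        # the constants omega_i
state = [1.0, 0.0, 0.0, 0.5, 1.0, 0.0, 0.2, -0.3, 1.5]      # non-orthonormal frame
g0, om0 = metric(state), omega_dual(state, w)
for _ in range(5000):
    state = rk4_step(state, w, 1e-3)
g1, om1 = metric(state), omega_dual(state, w)
g_drift = max(abs(g1[r][c] - g0[r][c]) for r in range(3) for c in range(3))
om_drift = max(abs(a - b) for a, b in zip(om1, om0))
print(g_drift, om_drift)
```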


12.3 Hamiltonian techniques 


The Hamiltonian formulation of mechanics is important in a range of applica- 
tions, not least because of its superior handling of numerical issues. We start by 
forming Hamilton’s equations in local coordinates, before placing Hamiltonian 
dynamics in a more geometric setting. Suppose that a dynamical system is 
described in terms of a Lagrangian $L(q_i, \dot{q}_i, t)$, where the $\{q_i\}$ are 
a set of $n$ coordinates for configuration space. The Euler-Lagrange equations are 

\[ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_i} \right) = \frac{\partial L}{\partial q_i}. \tag{12.85} \]

These equations typically result in a set of $n$ second-order equations that 
relate the generalised momenta to the forces in the system. The Euler-Lagrange 
equations are equivalent to the set of $2n$ first-order equations 

\[ \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}. \tag{12.86} \]

These are Hamilton’s equations. The Hamiltonian $H(q_i, p_i, t)$ is given by 

\[ H(q_i, p_i, t) = \sum_{i=1}^{n} p_i \dot{q}_i - L(q_i, \dot{q}_i, t), \tag{12.87} \]

in which the $\dot{q}_i$ are expressed in terms of the $p_i$ by inverting the 
equations 

\[ p_i = \frac{\partial L}{\partial \dot{q}_i}. \tag{12.88} \]


The transformation from a Lagrangian to a Hamiltonian framework is called a 
Legendre transformation. We move from considering dynamics in n-dimensional 
configuration space to a 2n-dimensional phase space. 

If the Hamiltonian is independent of time we can immediately see that it is 
conserved. That is, H gives the conserved energy in the system. The proof is 
straightforward: 


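\[ \frac{dH}{dt} = \sum_{i=1}^{n} \left( \frac{\partial H}{\partial q_i} \dot{q}_i + \frac{\partial H}{\partial p_i} \dot{p}_i \right) = 0. \tag{12.89} \]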


Phase space provides a very useful way of analysing the motion and stability of 
complicated systems. As a simple example, consider a pendulum consisting of a 
mass m attached to a rigid rod of length a. The configuration of the system is 
described by a single angle $\theta$, and the Lagrangian is 

\[ L = \tfrac{1}{2} m a^2 \dot{\theta}^2 + m g a \cos(\theta). \tag{12.90} \]

The Hamiltonian is therefore 

\[ H = \frac{p_\theta^2}{2 m a^2} - m g a \cos(\theta), \tag{12.91} \]

and this is conserved. The trajectories of the system can be visualised in terms 
of a phase-space portrait, which plots surfaces of constant $H$ in phase space. 
Sample trajectories are shown in figure 12.1. The figure illustrates how the 
phase portrait can capture global aspects of the system, such as the behaviour 
of the system as the energy gets close to the value for which the pendulum can 
complete a full loop. 

Figure 12.1 A phase portrait. The $q$ coordinate represents the angle and 
$p$ is the canonical momentum. The Hamiltonian is $p^2 - \cos(q)$. As $H$ 
approaches 1 a bifurcation appears, corresponding to the energy for which 
the pendulum can complete a loop. The system is periodic, so the phase 
portrait can be thought of as wrapping up into a cylinder. 
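
The two regimes visible in the phase portrait can be reproduced with a short 
integration of Hamilton's equations for the pendulum Hamiltonian (12.91). The 
sketch below takes $m = g = a = 1$, so that the critical energy is $H = 1$, and 
uses a leapfrog scheme. The parameter values, initial conditions, integrator and 
function names are assumptions made for the illustration. 

```python
import math

def hamiltonian(theta, p, m=1.0, g=1.0, a=1.0):
    """H = p^2 / (2 m a^2) - m g a cos(theta), equation (12.91)."""
    return p * p / (2 * m * a * a) - m * g * a * math.cos(theta)

def leapfrog(theta, p, dt, steps, m=1.0, g=1.0, a=1.0):
    """Integrate Hamilton's equations and return the list of (theta, p) samples."""
    traj = [(theta, p)]
    for _ in range(steps):
        p -= 0.5 * dt * m * g * a * math.sin(theta)    # dp/dt = -dH/dtheta
        theta += dt * p / (m * a * a)                  # dtheta/dt = dH/dp
        p -= 0.5 * dt * m * g * a * math.sin(theta)
        traj.append((theta, p))
    return traj

def energy_drift(traj):
    H0 = hamiltonian(*traj[0])
    return max(abs(hamiltonian(th, p) - H0) for th, p in traj)

libration = leapfrog(0.0, 1.9, 0.01, 5000)   # H = 0.805, below the critical energy
rotation = leapfrog(0.0, 2.1, 0.01, 5000)    # H = 1.205, above the critical energy

max_swing = max(abs(th) for th, _ in libration)
print(max_swing, rotation[-1][0], energy_drift(libration), energy_drift(rotation))
```

Below the critical energy the angle stays bounded and the trajectory is a closed 
curve; above it the angle grows without bound, corresponding to the curves that 
wrap around the cylinder. 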


12.3.1 Symplectic geometry 


The natural setting for Hamilton’s equations is provided by symplectic geometry. 
A symplectic manifold $(M, \Omega)$ consists of a $2n$-dimensional manifold $M$ 
together with a closed, non-degenerate 2-form $\Omega$. We will assume that $n$ 
is finite so as to avoid discussion of the technicalities of infinite-dimensional 
spaces. We can analyse this structure using the apparatus of vector manifolds, 
described in section 6.5. A symplectic manifold does not have a metric structure, 
so we must take care not to employ the metric induced in the vector manifold by 
its embedding. This means we must distinguish tangent and cotangent spaces, as 
we can only apply the inner product between tangent and cotangent vectors. We 
denote the tangent space at $x$ by $T_x M$, and the cotangent space by $T_x^* M$. 

The covariant vector derivative is denoted by $\nabla$, and always results in a 
multivector that is intrinsic to the manifold (in section 6.5.3 this derivative 
was denoted $D$). The 2-form $\Omega$ is a bivector field evaluated in the 
cotangent space. The statement that $\Omega$ is closed is simply 

\[ \nabla \wedge \Omega = 0. \tag{12.92} \]


This is required in order that the Poisson bracket satisfies the Jacobi identity. 
The condition that $\Omega$ is non-degenerate is simply that 

\[ \Omega(x) \cdot a \neq 0, \quad \forall a \neq 0, \ a \in T_x M. \tag{12.93} \]

If we view $\Omega(x) \cdot a$ as a linear map from $T_x M$ to $T_x^* M$, then 
$\Omega$ being non-degenerate implies that the map has non-zero determinant, so 
is invertible. The inverse map is generated by a second bivector, which we label 
$J$. This second bivector lies in the tangent space, and can be viewed as the 
inverse of $\Omega$. The two bivectors are related by the pair of equations 

\[ \begin{aligned}
J \cdot (\Omega \cdot a) &= a, & \forall a &\in T_x M, \\
\Omega \cdot (J \cdot a^*) &= a^*, & \forall a^* &\in T_x^* M.
\end{aligned} \tag{12.94} \]

The properties of $\Omega$ and $J$ can be understood simply by introducing a set 
of local coordinates $(p^i, q^i)$ over $M$. In terms of these we define the 
tangent vectors 

\[ e_i = \frac{\partial x}{\partial p^i}, \qquad f_i = \frac{\partial x}{\partial q^i}, \tag{12.95} \]

and the cotangent vectors 

\[ e^i = \nabla p^i, \qquad f^i = \nabla q^i. \tag{12.96} \]

We then set 

\[ \Omega = \sum_{i=1}^{n} f^i \wedge e^i, \qquad J = \sum_{i=1}^{n} e_i \wedge f_i. \tag{12.97} \]

So $J$ and $\Omega$ both have a similar structure to the complex bivector 
introduced in section 11.4. By construction, $\Omega$ is clearly closed. It is 
also straightforward to verify the relations 

\[ \begin{aligned}
\Omega \cdot e_i &= f^i, & \Omega \cdot f_i &= -e^i, \\
J \cdot e^i &= -f_i, & J \cdot f^i &= e_i,
\end{aligned} \tag{12.98} \]

which confirm that equations (12.94) are satisfied. 
The Hamiltonian $H(x, t)$ is a scalar field defined over $M$, and the dynamics of 
the system are governed by the equation 

\[ \dot{x} = (\nabla H) \cdot J. \tag{12.99} \]

This is an identity between tangent vectors. In terms of local coordinates this 
equation becomes 

\[ \dot{p}^i e_i + \dot{q}^i f_i = \left( e^i \frac{\partial H}{\partial p^i} + f^i \frac{\partial H}{\partial q^i} \right) \cdot J = f_i \frac{\partial H}{\partial p^i} - e_i \frac{\partial H}{\partial q^i}, \tag{12.100} \]

where repeated indices are summed over $1, \ldots, n$. We therefore recover 
Hamilton’s equations in local coordinates. 
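
In coordinates, equation (12.99) says that the phase-space velocity is obtained 
by applying a fixed antisymmetric operation to the gradient of $H$. The sketch 
below implements this for a phase point stored as a list 
$(q_1, \ldots, q_n, p_1, \ldots, p_n)$ and checks it against Hamilton's equations 
for a pair of coupled oscillators. The ordering of the coordinates, the 
finite-difference gradient, the sample Hamiltonian and all names are assumptions 
of the example. 

```python
def num_grad(H, z, h=1e-6):
    """Central-difference gradient of the scalar function H at the point z."""
    grad = []
    for k in range(len(z)):
        zp, zm = list(z), list(z)
        zp[k] += h
        zm[k] -= h
        grad.append((H(zp) - H(zm)) / (2 * h))
    return grad

def symplectic_flow(H, z):
    """Return dz/dt for z = (q_1..q_n, p_1..p_n): dq/dt = dH/dp, dp/dt = -dH/dq."""
    n = len(z) // 2
    g = num_grad(H, z)
    return g[n:] + [-gi for gi in g[:n]]

def H(z):
    """Two coupled oscillators (an illustrative Hamiltonian)."""
    q1, q2, p1, p2 = z
    return 0.5 * (p1 * p1 + p2 * p2) + 0.5 * (q1 * q1 + 4.0 * q2 * q2) + 0.3 * q1 * q2

z = [0.4, -0.2, 0.1, 0.7]
flow = symplectic_flow(H, z)

# Rate of change of H along the flow; it vanishes because the operation is antisymmetric.
dH_dt = sum(gi * fi for gi, fi in zip(num_grad(H, z), flow))
print(flow, dH_dt)
```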
12.3.2 Conservation theorems and the Poisson bracket 


We now restrict to the case where H is independent of time t. Suppose that a 
scalar function f(x) is defined over phase space. The evolution of this along a 
phase space trajectory x(t) is determined by 


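\[ \dot{f} = \dot{x} \cdot \nabla f = (\nabla f \wedge \nabla H) \cdot J. \tag{12.101} \]

It follows immediately that $\dot{H} = 0$. A further consequence follows if $H$ 
is invariant along some direction $a$ in phase space. If we form the directional 
derivative of $H$ we obtain 

\[ a \cdot \nabla H = (J \cdot (\Omega \cdot a)) \cdot \nabla H = (\Omega \cdot a) \cdot \dot{x}. \tag{12.102} \]

So if $H$ is unchanged in the $a$ direction we have 

\[ (\Omega \cdot a) \cdot \dot{x} = 0, \tag{12.103} \]

so all flows are perpendicular to the cotangent vector $\Omega \cdot a$. 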

The equation for the evolution of f leads naturally to the definition of the 
Poisson bracket of a Hamiltonian system. Given two scalar fields f and g the 
Poisson bracket is defined by 


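\[ \{ f(x), g(x) \} = (\nabla f \wedge \nabla g) \cdot J. \tag{12.104} \]

In terms of local coordinates this takes the more familiar form 

\[ \{ f(x), g(x) \} = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial q^i} \frac{\partial g}{\partial p^i} - \frac{\partial f}{\partial p^i} \frac{\partial g}{\partial q^i} \right). \tag{12.105} \]

The geometric form neatly brings out the antisymmetry of the Poisson bracket. 
It follows, for example, that the Poisson bracket with the Hamiltonian returns 
the time development of a scalar field: 

\[ \{ f, H \} = (\nabla f \wedge \nabla H) \cdot J = \dot{f}. \tag{12.106} \]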


Poisson brackets and the Hamiltonian formulation of dynamics provide a natural 
route through to quantum mechanics, where Poisson brackets are replaced by 
operator commutation relations. 
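
The coordinate form of the Poisson bracket is straightforward to evaluate 
numerically. The sketch below uses central differences to compute the bracket at 
a point and checks three of the properties stated above: the canonical value 
$\{q, p\} = 1$, antisymmetry, and the fact that the bracket with the Hamiltonian 
gives the time derivative of a scalar field. The pendulum Hamiltonian with 
$m = g = a = 1$, the test function, the step size and the function names are 
assumptions of the example. 

```python
import math

def partial(f, z, k, step=1e-4):
    """Central-difference derivative of f with respect to coordinate k."""
    zp, zm = list(z), list(z)
    zp[k] += step
    zm[k] -= step
    return (f(zp) - f(zm)) / (2 * step)

def poisson(f, g, z):
    """{f, g} at z = (q_1..q_n, p_1..p_n), in the coordinate form of the bracket."""
    n = len(z) // 2
    return sum(partial(f, z, i) * partial(g, z, n + i)
               - partial(f, z, n + i) * partial(g, z, i) for i in range(n))

def q(z):
    return z[0]

def p(z):
    return z[1]

def H(z):
    """Pendulum Hamiltonian with m = g = a = 1."""
    return 0.5 * z[1] ** 2 - math.cos(z[0])

def f(z):
    """An arbitrary scalar field on phase space."""
    return z[0] ** 2 * z[1]

z0 = [0.7, -0.3]
# df/dt along the flow, from the chain rule with dq/dt = p and dp/dt = -sin(q).
f_dot = 2 * z0[0] * z0[1] * z0[1] - z0[0] ** 2 * math.sin(z0[0])
print(poisson(q, p, z0), poisson(f, H, z0), f_dot)
```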

An important property satisfied by the Poisson bracket is the Jacobi identity 


\[ \{ \{ f, g \}, h \} + \{ \{ g, h \}, f \} + \{ \{ h, f \}, g \} = 0, \tag{12.107} \]


which is easily confirmed in terms of local coordinates. This identity links the 
Poisson bracket structure to a Lie algebra structure. The identity is satisfied by 
any symplectic manifold, as we now establish. We first write 


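\[ \{ f, g \} = (\nabla \wedge (f \nabla g)) \cdot J = (J \cdot \nabla) \cdot (f \nabla g), \tag{12.108} \]

which follows from the identity that $\nabla \wedge \nabla = 0$ (every exact form 
is closed). The Jacobi identity now reduces to 

\[ (J \cdot \nabla) \cdot \bigl( (\nabla f \wedge \nabla g) \cdot J \, \nabla h + \text{cyclic permutations} \bigr) = 0. \tag{12.109} \]

If we define 

\[ T = \nabla f \wedge \nabla g \wedge \nabla h, \tag{12.110} \]

then equation (12.109) simplifies to 

\[ (J \cdot \nabla) \cdot (J \cdot T) = 0. \tag{12.111} \]

To simplify this equation further we employ the identity 

\[ (B \wedge B) \cdot a = 2 B \wedge (B \cdot a), \tag{12.112} \]

which holds for any bivector $B$ and vector $a$. We can now write 

\[ (J \cdot \nabla) \cdot (J \cdot T) = \bigl( (J \cdot \nabla) \wedge J \bigr) \cdot T + \tfrac{1}{2} (J \wedge J) \cdot (\nabla \wedge T), \tag{12.113} \]

and the final term here vanishes as $T$ is exact. It follows that the Jacobi 
identity reduces to the condition 

\[ (J \cdot \nabla) \wedge J = 0. \tag{12.114} \]

The final task is to demonstrate that this identity for $J$ is equivalent to the 
statement that $\Omega$ is closed. Equation (12.114) is evaluated entirely in 
tangent space. If we use $\Omega$ to map each term into the cotangent space we 
arrive at the equivalent expression 

\[ e^\alpha \wedge e^\beta \wedge \nabla \bigl( J \cdot ((\Omega \cdot e_\alpha) \wedge (\Omega \cdot e_\beta)) \bigr) = 0, \tag{12.115} \]

where Greek indices are summed from 1 to $2n$, with the first $n$ covering the 
$e_i$ frame, and the second $n$ covering the $f_i$ frame. This identity is 
equivalent to 

\[ \nabla \wedge \bigl( e^\alpha \wedge e^\beta \, J \cdot ((\Omega \cdot e_\alpha) \wedge (\Omega \cdot e_\beta)) \bigr) - e^\alpha \wedge e^\beta \wedge \nabla \bigl( \check{J} \cdot ((\Omega \cdot e_\alpha) \wedge (\Omega \cdot e_\beta)) \bigr) = 0, \tag{12.116} \]

where the check denotes that $J$ is not differentiated in the second expression. 
The frame derivatives in this expression can all be shown to vanish, which leaves 
(with overdots marking the factor of $\Omega$ that is differentiated) 

\[ 2 \nabla \wedge \Omega - e^\alpha \wedge e^\beta \wedge \dot{\nabla} \bigl( J \cdot ((\dot{\Omega} \cdot e_\alpha) \wedge (\Omega \cdot e_\beta)) + J \cdot ((\Omega \cdot e_\alpha) \wedge (\dot{\Omega} \cdot e_\beta)) \bigr) = 0. \tag{12.117} \]

This is equivalent to 

\[ -2 \nabla \wedge \Omega = 0, \tag{12.118} \]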


which proves the main result. Any symplectic manifold admits a Poisson bracket 
structure that satisfies the Jacobi identity. As such, any symplectic manifold can 
form the basis for a Hamiltonian system. 
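
The Jacobi identity (12.107) can be spot-checked numerically in the canonical 
coordinates used earlier. The sketch below builds the bracket of two functions as 
a new function, so that brackets can be nested, and evaluates the cyclic sum at a 
sample point for a system with two degrees of freedom. The nested 
finite-difference scheme is only accurate to roughly the square of the step size, 
so the sum is compared with zero to within a tolerance. The sample functions, the 
step size and all names are assumptions of the example. 

```python
import math

def partial(f, z, k, step=1e-3):
    """Central-difference derivative of f with respect to coordinate k."""
    zp, zm = list(z), list(z)
    zp[k] += step
    zm[k] -= step
    return (f(zp) - f(zm)) / (2 * step)

def bracket(f, g):
    """Return the function z -> {f, g}(z) for z = (q_1..q_n, p_1..p_n)."""
    def fg(z):
        n = len(z) // 2
        return sum(partial(f, z, i) * partial(g, z, n + i)
                   - partial(f, z, n + i) * partial(g, z, i) for i in range(n))
    return fg

# Three sample scalar fields on a four-dimensional phase space (q1, q2, p1, p2).
def f(z):
    return z[0] ** 2 * z[3] + z[1] * z[2]

def g(z):
    return math.sin(z[0]) * z[2] + z[1] ** 2 * z[3]

def h(z):
    return z[2] * z[3] + z[0] * z[1]

z0 = [0.3, -0.8, 0.5, 1.1]
terms = [bracket(bracket(f, g), h)(z0),
         bracket(bracket(g, h), f)(z0),
         bracket(bracket(h, f), g)(z0)]
jacobi_sum = sum(terms)
print(terms, jacobi_sum)
```

The individual nested brackets are of order one, while their cyclic sum vanishes 
to within the finite-difference error. 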
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12.3.3 The cotangent bundle 


In practice, the phase space for a Hamiltonian system is often the cotangent
bundle of configuration space. This works as follows. Suppose that $Q$ denotes
configuration space. It is a manifold with potentially non-trivial topology. At
each point $q$ in the manifold we define the cotangent space $T^*_q Q$. If $q^i$ form a set
of local coordinates in configuration space, then $\nabla q^i$ define a set of basis vectors
for $T^*_q Q$. An arbitrary cotangent vector can be written as $p_i\nabla q^i$, so the $p_i$ can
be used as coordinates for $T^*_q Q$. Now consider the bundle of all cotangent spaces,
$T^*Q$. This is a manifold, and a general point in $T^*Q$ is specified by the set of
$2n$ coordinates $(q^i, p_i)$. The first $n$ of these locate the position over $Q$, and the
second $n$ locate a point within the cotangent space. The cotangent bundle $T^*Q$
is a symplectic manifold, with the symplectic structure defined by


\[ \Omega = \sum_{i=1}^{n} \nabla q^i\wedge\nabla p_i. \tag{12.119} \]
The reason this structure often arises is that, while there may be constraints 
placed on configuration space, there are usually no restrictions in momentum 
space. Returning to the case of the simple pendulum, $Q$ is a circle since $\theta$ is a
periodic coordinate. But there are no such constraints on the conjugate momentum $p_\theta$, so the cotangent
space is a line. The manifold T*Q can therefore be visualised as a cylinder, 
which is the phase space for the pendulum. 


12.3.4 Canonical transformations 


Suppose that $(M_1, \Omega_1)$ and $(M_2, \Omega_2)$ are two symplectic manifolds. We let $f$
denote a map from $M_1$ to $M_2$, which we write as


\[ x' = f(x), \qquad x \in M_1, \quad x' \in M_2. \tag{12.120} \]


This map is canonical if it respects the symplectic structure. That is, we must 
have 


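\[ \bar{\mathsf{f}}(\Omega_2) = \Omega_1, \qquad \mathsf{f}(J_1) = J_2. \tag{12.121} \]
Here
\[ \mathsf{f}(a) = a\cdot\nabla f(x) \tag{12.122} \]
is the differential of $f$, a linear map from $T_x M_1$ to $T_{x'} M_2$, and $\bar{\mathsf{f}}$ is its adjoint. The fact that $\Omega$ is non-degenerate means that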
we can define a volume form on either manifold by 
v= ("an a “(ann AQ) an: (12.123) 


The adjoint map $\bar{\mathsf{f}}$ must preserve this volume form, so has non-zero determinant. It
follows that $\bar{\mathsf{f}}$ is invertible for a canonical transformation, and hence so too is the differential $\mathsf{f}$.
As a check, the Poisson bracket structure remains intact as
\[ J_1\cdot(\nabla f\wedge\nabla g) = J_2\cdot\bar{\mathsf{f}}^{-1}(\nabla f\wedge\nabla g) = J_2\cdot(\nabla_2 f\wedge\nabla_2 g), \tag{12.124} \]


where $\nabla_2 = \bar{\mathsf{f}}^{-1}(\nabla)$ is the vector derivative on $M_2$, and the $f$ and $g$ inside the
brackets are scalar functions on phase space. It follows that the dynamics
can be formulated on either $M_1$ or $M_2$, and the physical results will remain
unchanged. This is potentially a very powerful result. The set of all possible
symplectic transformations is large, and there may well be a suitable transformation
which can dramatically simplify the dynamics. This is particularly evident
when one notices that symplectic transformations can mix up the position and 
momentum coordinates in one space. These transformations are richer than 
simply converting between configuration spaces. 

In some applications, phase space is simply $\mathbb{R}^{2n}$, and the bivector $J$ is constant.
In this case we can consider canonical transformations which map phase space 
onto itself. For these the map f is canonical if and only if 


f(J) = J. (12.125) 


Linear transformations satisfying this identity define the symplectic group. This 
can be analysed using the spin group approach developed in section 11.4. 
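
For the constant-$J$ case, condition (12.125) can be checked numerically. The sketch below is an illustration only, not the spin-group analysis of section 11.4. It assumes canonical coordinates $(q^1,\ldots,q^n,p_1,\ldots,p_n)$, represents $J$ by the block matrix [[0, I], [-I, 0]] (the overall sign is a convention) and a linear map by a matrix $M$, so that the condition becomes $M J M^{T} = J$. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def canonical_j(n):
    """Assumed matrix of J in coordinates (q^1..q^n, p_1..p_n)."""
    size = 2 * n
    j = [[0.0] * size for _ in range(size)]
    for i in range(n):
        j[i][n + i] = 1.0
        j[n + i][i] = -1.0
    return j

def is_symplectic(m, tol=1e-12):
    """True if the linear map with matrix m satisfies M J M^T = J."""
    n = len(m) // 2
    j = canonical_j(n)
    mapped = matmul(matmul(m, j), transpose(m))
    return all(abs(mapped[r][c] - j[r][c]) <= tol
               for r in range(2 * n) for c in range(2 * n))

theta = 0.7
# A rotation in the (q, p) plane mixes position and momentum.
rotation = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]
# Stretch q and shrink p by the same factor.
squeeze = [[2.0, 0.0], [0.0, 0.5]]
# Stretch q and p together: this rescales J and is not canonical.
scaling = [[2.0, 0.0], [0.0, 2.0]]

print(is_symplectic(rotation), is_symplectic(squeeze), is_symplectic(scaling))
```

For one degree of freedom the test reduces to $\det M = 1$, which is why the rotation and the squeeze pass while the uniform scaling does not.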


12.4 Lagrangian field theory 


The Lagrangian approach to classical dynamics extends to field theory, which 
can be viewed as the dynamics of systems with an infinite number of degrees of 
freedom. There are some technical issues connected with the infinite size of the 
configuration space, but we will not discuss these here. Suppose that the system 
of interest depends on a field $\psi(x)$, where for simplicity we will assume that $x$
is a spacetime vector. This does not restrict us to relativistic theories, as there 
is no need to restrict the Lagrangian to be Lorentz-invariant. The action is now 
defined as an integral over a region of spacetime by 


\[ S = \int d^4x\, \mathcal{L}(\psi, \partial_\mu\psi, x), \tag{12.126} \]


where $\mathcal{L}$ is the Lagrangian density and $x^\mu$ are a set of fixed orthonormal coordinates
for spacetime. More general coordinate systems are easily accommodated
with the inclusion of suitable factors of the Jacobian. 

The derivation of the Euler-Lagrange equations proceeds precisely as in section 12.1.
We assume that $\psi_0(x)$ represents the extremal path, satisfying the
desired boundary conditions, and look for variations of the form


\[ \psi(x) = \psi_0(x) + \epsilon\phi(x). \tag{12.127} \]


Here $\phi(x)$ is a field of the same form as $\psi(x)$, which vanishes over the boundary.
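The first-order variation in the action is (summation convention in force)

\[ \left.\frac{dS}{d\epsilon}\right|_{\epsilon=0} = \int d^4x \left( \phi\,\frac{\partial\mathcal{L}}{\partial\psi} + \frac{\partial\phi}{\partial x^\mu}\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} \right). \tag{12.128} \]
The final term is integrated by parts, and the boundary term vanishes. We
therefore find that
\[ \left.\frac{dS}{d\epsilon}\right|_{\epsilon=0} = \int d^4x\, \phi \left( \frac{\partial\mathcal{L}}{\partial\psi} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right) \right), \tag{12.129} \]
from which we can read off the variational equations as
\[ \frac{\partial\mathcal{L}}{\partial\psi} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right) = 0. \tag{12.130} \]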


If more fields are present we obtain an equation of this form for each field. Our 
main applications of these equations are in the following chapters, where we 
discuss gauge theories and gravitation. Here we illustrate the equations with a 
pair of examples concerned with elastic and fluid materials. 
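
As a simple illustration of equation (12.130), consider a real scalar field with a quadratic Lagrangian density. This example is not taken from the text: the choice of Lagrangian and the mass parameter $m$ are assumptions made purely for illustration.

```latex
\begin{aligned}
\mathcal{L} &= \tfrac{1}{2}\,\partial_\mu\psi\,\partial^\mu\psi - \tfrac{1}{2}m^2\psi^2, \\
\frac{\partial\mathcal{L}}{\partial\psi} &= -m^2\psi, \qquad
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} = \partial^\mu\psi, \\
\frac{\partial\mathcal{L}}{\partial\psi}
 - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right)
 &= -\left(\partial_\mu\partial^\mu + m^2\right)\psi = 0 .
\end{aligned}
```

The resulting field equation is the Klein-Gordon equation, $(\nabla^2 + m^2)\psi = 0$.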


12.4.1 Hyperelastic materials 


The equations of continuum mechanics, which govern an elastic body, were de- 
rived in section 6.6. For certain elastic materials it is possible to obtain these 
equations from a variational principle. We follow the notation of section 6.6, so 
$f$ is the displacement field, $\mathsf{f}(a)$ is the directional derivative of $f$, and $C = \bar{\mathsf{f}}\mathsf{f}$ is
the Cauchy-Green tensor. The materials of interest here are called hyperelastic.
These are defined by the property that, in the absence of external fields, their 
internal energy U is a function of C only. A suitable action for this system is 


\[ S = \int dt\, d^3x \left( \tfrac{1}{2}\rho\,\dot{f}^2 - U \right), \tag{12.131} \]


from which we can read off $\mathcal{L}$. Overdots denote the derivative with respect to
time, and the integral runs over the space of the reference copy of the body. 
The Euler-Lagrange equations are found entirely from the variation of the 
action with respect to the displacement field f. Since the Lagrangian depends 
on f through only its time and space derivatives, the Euler-Lagrange equations 


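are
\[ \frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\dot{f}}\right) + \frac{\partial}{\partial x^i}\left(\frac{\partial\mathcal{L}}{\partial(\partial_i f)}\right) = 0. \tag{12.132} \]
These simplify to
\[ \rho\,\dot{v} = \frac{\partial}{\partial x^i}\left(\frac{\partial U}{\partial(\partial_i f)}\right), \tag{12.133} \]
where $v = \dot{f}$ is the local velocity of the body. Comparison with equation (6.319)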


tells us that we must have
\[ \mathsf{T}(e^i) = \frac{\partial U}{\partial(\partial_i f)}, \tag{12.134} \]
where $\{e_i\}$ is the (fixed) coordinate frame defined by the $x^i$ coordinates. To
simplify the right-hand side we employ the derivative with respect to a linear 
function, as defined in section 11.1.2. From the definition of equation (11.24) we 
have 


o o 
— =e, 12.135 
Ife af ( ) 
The scalars fj; are defined by 
These are the components of the vector 0; f, so we can write 
o o o 
(12.137) 


(Of) OF, OEN 

The notation for the derivative with respect to $\mathsf{f}(e^i)$ is slightly misleading, as $\mathsf{f}$
is never evaluated on $e^i$. Instead, we have

\[ \mathsf{T}(a) = \partial_{\mathsf{f}(a)} U(C). \tag{12.138} \]


The fact that $U$ is a function of $C = \bar{\mathsf{f}}\mathsf{f}$ ensures that the second Piola-Kirchhoff
stress $\mathcal{T} = \mathsf{f}^{-1}\mathsf{T}$ is a symmetric function.

To make further progress we must specify the precise form of U, which amounts 
to specifying the constitutive properties of the system. The simplest hyperelastic 
materials to consider are isotropic and homogeneous. For these the internal 
energy can only depend on the principal stretches: 


\[ U = U(\lambda_1, \lambda_2, \lambda_3). \tag{12.139} \]


Even within this class there are a large variety of models one can consider. To 
obtain a linear model the energy should be quadratic in the strains, where linear 
in this context refers to the relationship between the stress and strain tensors, 
and not to the underlying dynamics. A natural model to consider is to define 
the strain by 


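\[ \mathsf{E}(a) = C^{1/2}(a) - a, \tag{12.140} \]
and set
\[ \begin{aligned} U(\mathsf{E}) &= G\,\mathrm{tr}(\mathsf{E}^2) + (B/2 - G/3)\,\mathrm{tr}(\mathsf{E})^2 \\ &= G\big(\mathrm{tr}(C) - 2\,\mathrm{tr}(C^{1/2}) + 3\big) + (B/2 - G/3)\big(\mathrm{tr}(C^{1/2}) - 3\big)^2, \end{aligned} \tag{12.141} \]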


where B and G are respectively the (constant) bulk and the shear moduli. To 
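find the stress tensor we need the derivative of $\mathrm{tr}\big((\bar{\mathsf{f}}\mathsf{f})^{1/2}\big)$. To evaluate this we
first write


\[ (\bar{\mathsf{f}}\mathsf{f})^{1/2} = \exp\big(\tfrac{1}{2}\ln(C)\big) = \exp(\mathsf{E}_{\ln}), \tag{12.142} \]
where
\[ \mathsf{E}_{\ln} = \tfrac{1}{2}\ln(C). \tag{12.143} \]
We can now make use of the result
\[ \partial_{\mathsf{f}(a)}\mathrm{tr}(\mathsf{E}_{\ln}^{\,n}) = n\,\bar{\mathsf{f}}^{-1}\mathsf{E}_{\ln}^{\,n-1}(a) \tag{12.144} \]
to prove that
\[ \partial_{\mathsf{f}(a)}\mathrm{tr}(C^{1/2}) = \bar{\mathsf{f}}^{-1}C^{1/2}(a). \tag{12.145} \]


The stress tensor $\mathsf{T}$ therefore evaluates to
\[ \begin{aligned} \mathsf{T} &= 2G\big(\mathsf{f} - \bar{\mathsf{f}}^{-1}(\bar{\mathsf{f}}\mathsf{f})^{1/2}\big) + (B - 2G/3)\,\mathrm{tr}(\mathsf{E})\,\bar{\mathsf{f}}^{-1}(\bar{\mathsf{f}}\mathsf{f})^{1/2} \\ &= \bar{\mathsf{f}}^{-1}(\bar{\mathsf{f}}\mathsf{f})^{1/2}\big(2G\mathsf{E} + (B - 2G/3)\,\mathrm{tr}(\mathsf{E})\,\mathsf{I}\big), \end{aligned} \tag{12.146} \]
where $\mathsf{I}$ is the identity transformation. The bracketed term is the expression we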


would expect to see in a linear theory. The extra pre-factor can be understood 
in terms of a singular-value decomposition of f. We write 


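\[ \mathsf{f} = \mathsf{R}\,C^{1/2}, \tag{12.147} \]
where $\mathsf{R}$ is a rotation. We then find that
\[ \bar{\mathsf{f}}^{-1}C^{1/2} = \mathsf{R}\,C^{-1/2}C^{1/2} = \mathsf{R}, \tag{12.148} \]
which recovers the rotation. We can now write
\[ \mathsf{T}(a) = \mathsf{R}\big(2G\,\mathsf{E}(a) + (B - 2G/3)\,\mathrm{tr}(\mathsf{E})\,a\big). \tag{12.149} \]


This can be understood as a linear function of $\mathsf{E}$, followed by a rotation to align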
the principal axes in the reference configuration with those in the body. 

The definition of $\mathsf{E}_{\ln}$ raises the interesting prospect that this could be used as
an alternative definition of the strain. For an isotropic, homogeneous medium this
amounts to choosing an energy density of


\[ U_{\ln} = G\big((\ln\lambda_1)^2 + (\ln\lambda_2)^2 + (\ln\lambda_3)^2\big) + (B/2 - G/3)\big(\ln(\lambda_1\lambda_2\lambda_3)\big)^2. \tag{12.150} \]


This definition has the same behaviour under small deflection as the potential
energy of equation (12.141), but differences emerge as the stresses build up. In
essence, the logarithmic definition of energy defines a material which retains its
elastic properties no matter what shape it is stretched into. This limits the
application of $U_{\ln}$ for modelling physical objects, though it may well be of use
in computer graphics simulations, as routines built on $U_{\ln}$ will not break down
when forces become large.
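
The two energy densities can be compared numerically. The sketch below is a minimal illustration, assuming that the principal stretches $\lambda_i$ are the eigenvalues of $C^{1/2}$, so that the strain of equation (12.140) has eigenvalues $\lambda_i - 1$ and the logarithmic strain has eigenvalues $\ln\lambda_i$. The function names and the moduli values are hypothetical.

```python
import math

def energy_quadratic(stretches, bulk, shear):
    """Energy of equation (12.141), with strain eigenvalues lambda_i - 1."""
    strains = [lam - 1.0 for lam in stretches]
    trace = sum(strains)
    trace_sq = sum(s * s for s in strains)
    return shear * trace_sq + (bulk / 2 - shear / 3) * trace ** 2

def energy_log(stretches, bulk, shear):
    """Energy of equation (12.150), with strain eigenvalues ln(lambda_i)."""
    strains = [math.log(lam) for lam in stretches]
    trace = sum(strains)
    trace_sq = sum(s * s for s in strains)
    return shear * trace_sq + (bulk / 2 - shear / 3) * trace ** 2

BULK, SHEAR = 2.0, 1.0  # illustrative moduli, arbitrary units

small = (1.001, 0.999, 1.002)  # small deflection
large = (3.0, 1.0, 1.0)        # large uniaxial stretch

for label, stretches in (("small", small), ("large", large)):
    u_quad = energy_quadratic(stretches, BULK, SHEAR)
    u_log = energy_log(stretches, BULK, SHEAR)
    print(label, u_quad, u_log, abs(u_quad - u_log) / u_quad)
```

For the small deflection the two energies agree to within a fraction of a per cent, while for the large stretch the logarithmic energy grows much more slowly than the quadratic one.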
12.4.2 Relativistic fluid dynamics


The field equations for a relativistic fluid can be formulated in a number of 
different ways. Here we give a fairly direct derivation, albeit from a slightly 
surprising starting point. We start with the action integral 


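\[ S = \int d^4x\, \big(-\varepsilon + J\cdot(\nabla\lambda) - \mu\,J\cdot\nabla\eta\big), \tag{12.151} \]


where $J(x)$ is a spacetime current, $\varepsilon$ is the total energy density and $\eta$ is the
entropy. The current can be written as


\[ J = \rho v, \qquad v^2 = 1, \tag{12.152} \]
and we assume that $\varepsilon$ is a function of $\rho$ and $\eta$ only, which we write as
\[ \varepsilon = \rho\big(1 + e(\rho,\eta)\big). \tag{12.153} \]
The remaining terms $\lambda$ and $\mu$ are Lagrange multipliers enforcing the two constraints
\[ \nabla\cdot J = 0, \qquad v\cdot\nabla\eta = 0. \tag{12.154} \]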
These are two of the four equations of motion. The first constraint is that the 
current is conserved, so the total number of particles in the system is constant. 
The second constraint says that entropy is constant along the field lines of J. The 
various constraints and assumptions ensure that we are describing a relativistic 


ideal fluid. 
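Variation with respect to $\eta$ yields the equation


\[ J\cdot\nabla\mu = \frac{\partial\varepsilon}{\partial\eta}, \tag{12.155} \]
and variation with respect to $J$ produces
\[ v(1 + e) + \rho v\,\frac{\partial e}{\partial\rho} = \nabla\lambda - \mu\nabla\eta. \tag{12.156} \]
In the derivation of this equation we have employed the result that
\[ \partial_J f(\rho) = v\,\frac{\partial f}{\partial\rho}. \tag{12.157} \]
Next, we define the pressure $P$ by
\[ P = \rho^2\,\frac{\partial e}{\partial\rho}, \tag{12.158} \]
so that equation (12.156) becomes
\[ v(\varepsilon + P) = \rho(\nabla\lambda - \mu\nabla\eta). \tag{12.159} \]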
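The final step is to remove the Lagrange multipliers by employing the constraint
equations. First we contract equation (12.159) with $v$ to obtain


\[ \varepsilon + P = \rho\,v\cdot\nabla\lambda = J\cdot(\nabla\lambda). \tag{12.160} \]
Next we differentiate equation (12.159) to obtain
\[ \begin{aligned} J\cdot\nabla(\nabla\lambda - \mu\nabla\eta) &= v\cdot\nabla\big(v(\varepsilon + P)\big) + v(\varepsilon + P)\,J\cdot\nabla(\rho^{-1}) \\ &= v\cdot\nabla\big(v(\varepsilon + P)\big) + v(\varepsilon + P)\,\nabla\cdot v. \end{aligned} \tag{12.161} \]


The left-hand side is manipulated as follows, with overdots marking the scope of the derivative:


\[ \begin{aligned} J\cdot\nabla(\nabla\lambda - \mu\nabla\eta) &= \nabla(J\cdot\nabla\lambda) - \dot{\nabla}\,\dot{J}\cdot\nabla\lambda - (J\cdot\nabla\mu)\,\nabla\eta + \mu\,\dot{\nabla}\,\dot{J}\cdot\nabla\eta \\ &= \nabla(\varepsilon + P) - \frac{\varepsilon + P}{\rho}\,\nabla\rho - \frac{\partial\varepsilon}{\partial\eta}\,\nabla\eta \\ &= \nabla P. \end{aligned} \tag{12.162} \]
We therefore arrive at the equation
\[ v\cdot\nabla\big(v(\varepsilon + P)\big) + v(\varepsilon + P)\,\nabla\cdot v = \nabla P, \tag{12.163} \]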


which describes a relativistic ideal fluid. This is more clearly seen if we introduce 
the relativistic stress-energy tensor T(a), which is defined by 


\[ \mathsf{T}(a) = (\varepsilon + P)\,a\cdot v\,v - Pa. \tag{12.164} \]


The rest frame of the fluid is defined locally by $v$. We find that $\mathsf{T}(v) = \varepsilon v$,
so that $\varepsilon$ is the local energy density, as required. In any spacelike direction $n$
perpendicular to $v$ we have $\mathsf{T}(n) = -Pn$, which shows that the local stress is
governed by an isotropic pressure $P$. These are the relativistic definitions of the
stress-energy tensor for an ideal fluid. The field equations reduce to the single
conservation equation

\[ \mathsf{T}(\nabla) = 0, \tag{12.165} \]
which expresses relativistic conservation of the stress-energy tensor. Electromagnetic
coupling is included simply with the addition of the term $-qJ\cdot A$ to the
Lagrangian density.
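
The two properties quoted for the stress-energy tensor of equation (12.164) are easy to confirm in components. The following sketch assumes an orthonormal frame with metric signature (+, -, -, -) and represents vectors as 4-component lists. The function names and numerical values are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def stress_energy(a, v, energy_density, pressure):
    """T(a) = (eps + P) (a.v) v - P a for an ideal fluid with velocity v."""
    scale = (energy_density + pressure) * minkowski_dot(a, v)
    return [scale * v_i - pressure * a_i for v_i, a_i in zip(v, a)]

rapidity = 0.5
# Unit timelike velocity of a fluid element moving along the first axis.
v = [math.cosh(rapidity), math.sinh(rapidity), 0.0, 0.0]
# Two spacelike directions orthogonal to v.
n_parallel = [math.sinh(rapidity), math.cosh(rapidity), 0.0, 0.0]
n_transverse = [0.0, 0.0, 1.0, 0.0]

EPS, PRESSURE = 3.0, 1.0  # illustrative values

print(stress_energy(v, v, EPS, PRESSURE))             # compare with EPS * v
print(stress_energy(n_parallel, v, EPS, PRESSURE))    # compare with -PRESSURE * n
print(stress_energy(n_transverse, v, EPS, PRESSURE))  # compare with -PRESSURE * n
```

The velocity is an eigenvector with eigenvalue $\varepsilon$, and every orthogonal spacelike direction is an eigenvector with eigenvalue $-P$, whatever the state of motion of the fluid element.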


12.5 Notes 


The Lagrangian formulation of mechanics is described in a wide range of books. 
Analytical Mechanics by Hand & Finch (1998) contains a detailed introduction 
and, despite its name, Introduction to Mechanics and Symmetry by Marsden & 
Ratiu (1994) contains a more detailed description of Lagrangian and Hamiltonian 
methods and symplectic geometry. Further applications, including relativistic
fluid dynamics, are contained in The Variational Principles of Dynamics by 
Kupershmidt (1992). 

Pseudoclassical mechanics was introduced by Berezin & Marinov (1977). Fur- 
ther references are contained in the notes to chapter 11. Similar ideas to those 
developed in this chapter have been applied in the supersymmetric setting by 
Heumann & Manton (2000). The section on spinor models of relativistic spin-1/2 
point particles was motivated by the initial work of Barut & Zanghi (1984). The 
description given here contains a number of refinements, many of which are also 
discussed in Doran (1994). A detailed discussion of the complexities involved 
in performing a path-integral quantisation of such systems is given by Barut & 
Duru (1989). 


12.6 Exercises 


12.1 A relativistic action for a point particle is defined by 


s= fart p-& 4 0? m?) gu'-A(x)), 


where A is an external field representing the electromagnetic vector po- 
tential. Vary S with respect to x, p and e to obtain the Lorentz force 
law. 

12.2 Prove that 


\[ \partial_\psi\langle M\psi^{-1}\rangle = -\psi^{-1}M\psi^{-1}, \]

where $\psi$ and $M$ are even multivectors.
12.3 The configuration of a rigid body is described by a rotor $R$. If we relax
the normalisation of $R$ and replace it by $\psi$, explain why we can write
$\Omega_B$ as
\[ \Omega_B = -2\tilde{R}\dot{R} = -\psi^{-1}\dot{\psi} + \dot{\tilde{\psi}}\,\tilde{\psi}^{-1}. \]

Now define the Lagrangian

\[ L(\psi, \dot{\psi}) = -\tfrac{1}{2}\,\Omega_B\cdot\mathcal{I}(\Omega_B), \]


where $\mathcal{I}$ is the inertia tensor. Find the Euler-Lagrange equation for
variation with respect to $\psi$. Prove that this produces the equation of
motion $\dot{J} = 0$, where $J$ is the angular momentum. Why does this method
work?

12.4 One classical model for a spin-1/2 particle describes the motion in terms 
of a rotor R and a momentum p. The rotor determines the quantities 


\[ \dot{x} = R\gamma_0\tilde{R}, \qquad S = R\,I\sigma_3\tilde{R} \]
and the equations of motion are
\[ \dot{p} = 0, \qquad p\cdot\dot{x} = m, \qquad \dot{S} + 2\dot{x}\wedge p = 0. \]


Verify that these are solved by 
m 


—0 = myo, 
(a) 


R= elC 3M*T ae 2/2 
cosh 


P _ 
Integrate x to find the trajectory of the particle and comment on its 
properties. 
12.5 Find the equations of motion for the Lagrangian


L= (Tosh — p(x! — eyo) — ema — ga’ -A), 


where $\psi$ is a spinor and $A(x)$ is an external electromagnetic vector potential.
Comment on the form of the solutions.
12.6 A set of vectors satisfy the equations


\[ \dot{e}_i + \epsilon_{ijk}\,\omega_j e_k = 0, \]


where the $\omega_j$ are constant. Prove that the volume element $E$ is constant,
where


\[ E = e_1\wedge e_2\wedge e_3. \]


12.7 The relativistic Hamiltonian for a charged particle in three dimensions
is defined by


\[ H(\boldsymbol{p}, \boldsymbol{x}, t) = \big((\boldsymbol{p} - q\boldsymbol{A})^2 + m^2\big)^{1/2} + q\phi, \]


where $\phi + \boldsymbol{A} = A\gamma_0$ and the vector potential $A$ is a function of $\boldsymbol{x}$ and
$t$. Find Hamilton's equations and prove that these recover the Lorentz
force law.
12.8 Fill in the missing steps in the proof that a closed non-degenerate 2-form
in a symplectic manifold guarantees that the Poisson bracket satisfies the 
Jacobi identity. 
12.9 A system is described by the Hamiltonian
\[ H(p, q) = \frac{1}{2}\left(\frac{1}{q^2} + p^2q^4\right). \]
Find a canonical transformation which maps this onto the Hamiltonian 
for a simple harmonic oscillator. 
12.10 The total energy in a hyperelastic medium is given by


\[ E = \int d^3x\, \big(\tfrac{1}{2}\rho v^2 + U\big). \]


Prove that the energy flow per unit area perpendicular to n is given by 
f-T(n), where n is a vector in the reference configuration. 
12.11 An incompressible elastic material is one for which $\det\mathsf{f} = 1$. The
Mooney-Rivlin model for rubber is an incompressible material with
internal energy


\[ U = \alpha(\lambda_1^2 + \lambda_2^2 + \lambda_3^2 - 3) + \beta\big((\lambda_2\lambda_3)^2 + (\lambda_3\lambda_1)^2 + (\lambda_1\lambda_2)^2 - 3\big), \]


where the $\lambda_i$ denote the principal strains. Analyse the properties of this
material under uniform pressure. What happens when two of the $\lambda_i$
pass through $4^{1/3}$?

12.12 A hyperelastic material is defined with an energy density


\[ U_{\ln} = G\big((\ln\lambda_1)^2 + (\ln\lambda_2)^2 + (\ln\lambda_3)^2\big) + (B/2 - G/3)\big(\ln(\lambda_1\lambda_2\lambda_3)\big)^2. \]


Prove that when this system is placed under isotropic pressure $P$ we
have


\[ -3B\ln\lambda = P. \]
Symmetry and gauge theory


The fundamental forces of nature can all be described in terms of gauge the- 
ories. Not long after the advent of quantum theory, physicists realised that 
electromagnetic interactions arise from demanding invariance of quantum wave 
equations under local changes of phase. This idea was later extended by Yang 
and Mills, who showed how to construct theories based on more complicated, 
non-commutative Lie groups. This is the basis for the standard model of the 
electroweak and strong interactions. Around this time physicists also turned 
their attention to gravitation, and discovered that general relativity could also 
be formulated as a gauge theory. But this time there was a price to pay. The 
existence of spinor fields means that the simple geometric structure of general 
relativity has to be modified by the inclusion of a torsion field, leading to an 
Einstein-Cartan theory. For clarity, we use the term general relativity to refer 
to the theory defined by Einstein, with zero torsion and the connection given by 
the Christoffel symbol. The extended theory, with torsion present, is referred to 
as Einstein-Cartan theory. 

While gauge theory is the dominant method in particle physics, it is less 
popular as a means of analysing gravitational interactions. This is, in part, 
due to the perception that the gauge theory equations are more complicated 
than their geometric counterparts. In this and the following chapter we argue 
that this apparent complexity is a reflection of the inappropriate mathematical 
techniques typically employed when analysing the gauge theory equations. The 
spacetime algebra provides the appropriate setting for a gauge formulation of 
gravity and, applied carefully, this approach is often easier to compute with 
than the metric formulation. We demonstrate that, in the absence of torsion 
and highly esoteric topology, the gauge and metric approaches produce the same 
physical predictions. 

We begin with a discussion of symmetry in the Maxwell and Dirac theories. 
Our starting point is the field Lagrangian, which we analyse using Noether’s 
theorem. In particular, we use this to extract the canonical energy-momentum
tensor, which is conserved in the absence of external fields. We then turn to the 
wider subject of gauge theories, before deriving the properties of the gauge fields 
for gravitation. This chapter concludes with a derivation of the gravitational 
field equations, and a discussion of the observable quantities in the theory. For 
the source matter, observables are contained in the functional energy-momentum 
tensor, which is closely related to the canonical tensor. Applications of the field 
equations are contained in chapter 14. Throughout the present chapter various 
results and notation from chapter 11 are assumed without comment. 


13.1 Conservation laws in field theory 


In section 12.4 we derived the Euler-Lagrange equations for field theory, and 
demonstrated how to apply these to the cases of elasticity and relativistic fluid 
dynamics. In this section we concentrate on conservation theorems for La- 
grangian field theory. As all of the applications that will concern us are to 
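relativistic field theory, we assume from the outset that we are describing
field theory in a (flat) spacetime. Given a Lagrangian density $\mathcal{L}(\psi_i, \partial_\mu\psi_i)$, where
$\psi_i$, $i = 1, \ldots, n$ are a set of multivector fields, the Euler-Lagrange equations governing
the evolution of the system are
\[ \frac{\partial\mathcal{L}}{\partial\psi_i} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi_i)}\right) = 0, \tag{13.1} \]


where $x^\mu = \gamma^\mu\cdot x$ are a set of fixed orthonormal coordinates. For the applications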
of interest here the final equations can always be assembled into a frame-free 
form. Curvilinear coordinates can then be introduced to analyse these equations, 
if desired. 

To obtain a version of Noether’s theorem appropriate for field theory we follow 
the derivation of section 12.1.1. For simplicity we assume that only one field is 
present. The results are easily extended to the case of more fields by summing 
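over all of the fields present. Suppose that $\psi'(x)$ is a new field obtained from
$\psi(x)$ by a scalar-parameterised transformation of the form


\[ \psi'(x) = f(\psi(x), \alpha), \tag{13.2} \]
with $\alpha = 0$ corresponding to the identity. We again define


\[ \delta\psi = \left.\frac{\partial\psi'}{\partial\alpha}\right|_{\alpha=0}. \tag{13.3} \]


With $\mathcal{L}'$ denoting the original Lagrangian evaluated on the transformed fields
we find that
\[ \left.\frac{\partial\mathcal{L}'}{\partial\alpha}\right|_{\alpha=0} = (\delta\psi)*\frac{\partial\mathcal{L}}{\partial\psi} + \partial_\mu(\delta\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} = \frac{\partial}{\partial x^\mu}\left((\delta\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right). \tag{13.4} \]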


This equation relates the change in the Lagrangian to the divergence of the 
current J, where 


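\[ J = \gamma_\mu\,(\delta\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}. \tag{13.5} \]

If the transformation is a symmetry of the system then $\mathcal{L}'$ is independent of $\alpha$.
In this case we immediately establish that the conjugate current is conserved,
that is,


\[ \nabla\cdot J = 0. \tag{13.6} \]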


Symmetries of a field Lagrangian therefore give rise to conserved currents. These 
in turn define Lorentz-invariant constants via 


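\[ Q = \int d^3x\, J^0, \tag{13.7} \]


where $J^0 = J\cdot\gamma^0$ is the density measured in the $\gamma_0$ frame. The fact that this is
constant follows from


\[ \frac{dQ}{dt} = \int d^3x\, \frac{\partial J^0}{\partial t} = -\int d^3x\, \frac{\partial J^i}{\partial x^i} = 0, \tag{13.8} \]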


where we assume that the current J falls off sufficiently fast at infinity. The value 
of Q is constant, and independent of the spatial hypersurface used to define the 
integral. 

If the transformation involves a change in the spacetime dependence, Noether’s 
theorem does apply, but we have to be careful in defining the transformation law 
for L. Suppose that we define 


\[ \psi'(x) = \psi(x'), \tag{13.9} \]
where
\[ x' = f(x). \tag{13.10} \]
The differential is defined in the usual way as
\[ \mathsf{f}(a) = a\cdot\nabla f(x). \tag{13.11} \]


The transformed action is 


= ae det (f)~1L(a(2’)) (13.12) 
from which we see that the correct definition of the transformed Lagrangian is
L'(w"(a)) = det (TLC). (13.13) 


This transformation law demonstrates that $\mathcal{L}$ is indeed a Lagrangian density.


13.1.1 Spacetime symmetries 


One of the most important spacetime symmetries is translational invariance. 
All fundamental theories are assumed to give rise to the same physical predic- 
tions, independent of the position of the fields in (flat) spacetime. That is, the 
background space is assumed to be homogeneous. A more careful discussion of 
this principle, and its relation to gravitation, is contained in section 13.4.1. In 
terms of the Lagrangian, this principle is encoded in the statement that all $x$
dependence enters $\mathcal{L}$ through the fields. In this case we can apply Noether's theorem
to extract a conserved quantity, though we could proceed equally simply
by differentiating $\mathcal{L}$ directly to obtain


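\[ a\cdot\nabla\mathcal{L} = \frac{\partial}{\partial x^\mu}\left((a\cdot\nabla\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right), \tag{13.14} \]


where the field equations have been assumed. We can therefore define the conserved
current conjugate to translations by


\[ \mathsf{T}(a) = \gamma_\mu\,(a\cdot\nabla\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} - a\mathcal{L}. \tag{13.15} \]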


This defines a linear function of a, called the canonical energy-momentum tensor. 
This is a conserved tensor if the system is invariant under translations, so 


\[ \nabla\cdot\mathsf{T}(a) = 0, \qquad \forall\ \text{constant } a. \tag{13.16} \]


The canonical energy-momentum tensor need not be symmetric, and its adjoint 
is found to be 


7 -7. 3L 
T(a) = aTa) = avy V(b r) a. (13.17) 
(a) = A(T (a) = am Vb a 

The conservation equation for the adjoint tensor is 
\[ \dot{\bar{\mathsf{T}}}(\dot{\nabla}) = 0. \tag{13.18} \]
If more than one field is present, the energy-momentum tensor is the sum of the 

individual contributions from each field. 

One can similarly define a conserved tensor conjugate to rotations. This time 
the assumption is that spacetime is isotropic, so does not contain any preferred 
directions except those defined by the fields themselves. The derivation is slightly 
more complicated now, as the fields transform in different ways depending on
their spins. For all cases we have


\[ x' = \tilde{R}xR, \qquad R = e^{\alpha B/2}, \tag{13.19} \]
and in general we can write
\[ \delta\psi = -B\cdot(x\wedge\nabla)\psi + \delta_B\psi, \tag{13.20} \]


where $\psi$ is a general field, and the precise form of $\delta_B\psi$ depends on the spin. The
transformation $x' = \tilde{R}xR$ has unit Jacobian, so Noether's theorem gives


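\[ \nabla\cdot(x\cdot B\,\mathcal{L}) = \frac{\partial}{\partial x^\mu}\left(\big(-B\cdot(x\wedge\nabla)\psi + \delta_B\psi\big)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}\right). \tag{13.21} \]
We can therefore read off the canonical angular momentum tensor $\mathsf{J}(B)$, where
\[ \begin{aligned} \mathsf{J}(B) &= \gamma_\mu\big(-B\cdot(x\wedge\nabla)\psi + \delta_B\psi\big)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)} + B\cdot x\,\mathcal{L} \\ &= \mathsf{T}(x\cdot B) + \gamma_\mu\,(\delta_B\psi)*\frac{\partial\mathcal{L}}{\partial(\partial_\mu\psi)}. \end{aligned} \tag{13.22} \]
This is a vector-valued linear function of the bivector $B$, which is conserved for
all constant $B$.
The adjoint function $\bar{\mathsf{J}}(a)$ is often easier to work with. This evaluates to


\[ \bar{\mathsf{J}}(a) = \partial_B\langle\mathsf{J}(B)a\rangle = \bar{\mathsf{T}}(a)\wedge x + \mathsf{S}(a), \tag{13.23} \]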


which is a bivector-valued linear function of the vector a. The form of J(a) 
generalises the point-particle definition of angular momentum to the field theory 
setting. The term S(a) is the canonical spin tensor, 


S(a) = 0-74 00 ( (508) 505 5 ) (13.24) 


The conservation equation for J states that 
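\[ \dot{\bar{\mathsf{J}}}(\dot{\nabla}) = 0 = \dot{\bar{\mathsf{T}}}(\dot{\nabla})\wedge x + \bar{\mathsf{T}}(\partial_a)\wedge a + \dot{\mathsf{S}}(\dot{\nabla}). \tag{13.25} \]


Since the energy-momentum tensor is also conserved, conservation of angular
momentum reduces to the equation


\[ \bar{\mathsf{T}}(\partial_a)\wedge a + \dot{\mathsf{S}}(\dot{\nabla}) = 0. \tag{13.26} \]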


So, in any homogeneous, isotropic, relativistic field theory, the antisymmetric 
part of the canonical energy-momentum tensor is a total divergence. 
13.2 Electromagnetism


As a first application of the preceding results we consider electromagnetism. 
The dynamical variable in electromagnetism is the vector potential A, and the 
electromagnetic Lagrangian density is 


\[ \mathcal{L} = \tfrac{1}{2}F\cdot F - A\cdot J, \tag{13.27} \]

where $F = \nabla\wedge A$, and $A$ couples to an external current $J$. An electromagnetic
gauge transformation is defined by
\[ A \mapsto A + \nabla\phi(x), \tag{13.28} \]
where $\phi(x)$ is a scalar field. Gauge invariance of the Lagrangian is ensured by
requiring that the current $J$ is conserved. The field equation is


a G WA (FF) a ee VM) =0, (13.29) 


Oxt! 


which simplifies to the familiar equation 
\[ \nabla\cdot F = J. \tag{13.30} \]


The remaining Maxwell equation, $\nabla\wedge F = 0$, follows from the definition of $F$ in
terms of $A$.


13.2.1 The electromagnetic energy-momentum tensor 
To calculate the free-field energy-momentum tensor, we set J = 0 and work with 
the Lagrangian density 


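\[ \mathcal{L}_0 = \tfrac{1}{2}\langle F^2\rangle. \tag{13.31} \]

Equation (13.15) yields the energy-momentum tensor

\[ \mathsf{T}(a) = (a\cdot\nabla A)\cdot F - \tfrac{1}{2}a\langle F^2\rangle. \tag{13.32} \]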


This expression is somewhat unsatisfactory as it stands, as it is not gauge- 
invariant. In order to find a gauge-invariant form of the energy-momentum 
tensor we write 


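\[ a\cdot\nabla A = a\cdot F + \nabla(A\cdot a). \tag{13.33} \]
If we now employ the field equations we can write
\[ \mathsf{T}(a) = F\cdot(F\cdot a) - \tfrac{1}{2}a\,F\cdot F + \nabla\cdot(A\cdot a\,F). \tag{13.34} \]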


The first two terms are gauge-invariant, and the final term is a total divergence. 
In most classical applications the total divergence can be ignored, as its integral 
over any finite volume results in a boundary term which can be set to zero. 
In quantum field theory the issue of how to handle gauge invariance is more 
complicated. Typically, manifest gauge invariance is lost at the level of the
quantum field equations, and only recovered in the physical predictions of the
theory. With the boundary term removed, the remaining terms recover the
familiar classical free-field electromagnetic energy-momentum tensor,
\[ \begin{aligned} \mathsf{T}_{em}(a) &= F\cdot(F\cdot a) - \tfrac{1}{2}a\,F\cdot F \\ &= -\tfrac{1}{2}FaF, \end{aligned} \tag{13.35} \]
as found in section 7.2.3. This tensor is gauge-invariant, traceless and sym- 
metric. It is also equal to the functional energy-momentum tensor, defined in 
section 13.5.4. 
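
The symmetric and traceless character of the free-field tensor can be confirmed numerically. The sketch below is a translation into conventional index notation rather than the spacetime-algebra form used in the text. It assumes natural (Heaviside-Lorentz) units, metric signature (+, -, -, -), the component convention $F^{i0} = E^i$, $F^{ij} = -\epsilon^{ijk}B^k$, and the standard index expression $T^{\mu\nu} = F^{\mu\alpha}F_\alpha{}^{\nu} + \tfrac{1}{4}\eta^{\mu\nu}F_{\alpha\beta}F^{\alpha\beta}$. The function names are hypothetical.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the Minkowski metric

def field_tensor(e, b):
    """Contravariant components F^{mu nu} built from E and B (assumed convention)."""
    f = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        f[i + 1][0] = e[i]
        f[0][i + 1] = -e[i]
    f[1][2], f[2][1] = -b[2], b[2]
    f[2][3], f[3][2] = -b[0], b[0]
    f[3][1], f[1][3] = -b[1], b[1]
    return f

def energy_momentum(f):
    """T^{mu nu} = F^{mu a} F_a^{nu} + (1/4) eta^{mu nu} F_{ab} F^{ab}."""
    invariant = sum(ETA[a] * ETA[b] * f[a][b] * f[a][b]
                    for a in range(4) for b in range(4))
    t = [[sum(f[m][a] * ETA[a] * f[a][n] for a in range(4))
          for n in range(4)] for m in range(4)]
    for m in range(4):
        t[m][m] += 0.25 * ETA[m] * invariant
    return t

def trace(t):
    """Metric trace eta_{mu nu} T^{mu nu}."""
    return sum(ETA[m] * t[m][m] for m in range(4))

E_FIELD = (1.0, 2.0, 3.0)
B_FIELD = (0.5, -1.0, 2.0)
T = energy_momentum(field_tensor(E_FIELD, B_FIELD))

energy_density = 0.5 * (sum(x * x for x in E_FIELD) + sum(x * x for x in B_FIELD))
print(T[0][0], energy_density)  # the two values should agree
print(trace(T))                 # should vanish up to rounding
```

The time-time component reproduces the familiar energy density $\tfrac{1}{2}(\boldsymbol{E}^2 + \boldsymbol{B}^2)$, and the time-space components reproduce the Poynting vector under the stated sign conventions.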


13.2.2 Angular momentum in electromagnetism 


The canonical angular momentum is found by considering the symmetry trans- 
formation 


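\[ A'(x) = RA(x')\tilde{R}, \tag{13.36} \]
with $R$ and $x'$ as defined in equation (13.19). The transformation law for $x$
implies that
\[ \nabla_{x'} = \tilde{R}\nabla R, \tag{13.37} \]
so that the new field satisfies
\[ \nabla\wedge A'(x) = R\,\nabla_{x'}\wedge A(x')\,\tilde{R} = RF(x')\tilde{R}. \tag{13.38} \]


It follows that the transformed free-field Lagrangian only depends on $\alpha$ through
the transformed position dependence, as required for isotropy. We also find that


\[ \delta A = B\cdot A - (B\cdot x)\cdot\nabla A, \tag{13.39} \]
so equation (13.22) gives
\[ \mathsf{J}(B) = \big(B\cdot A - (B\cdot x)\cdot\nabla A\big)\cdot F + \tfrac{1}{2}B\cdot x\,\langle F^2\rangle. \tag{13.40} \]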


As with the canonical energy-momentum tensor, the angular momentum tensor 
is not manifestly gauge-invariant. This time we write 


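\[ \begin{aligned} (B\cdot x)\cdot\nabla A &= (B\cdot x)\cdot(\nabla\wedge A) + \dot{\nabla}\,(B\cdot x)\cdot\dot{A} \\ &= (B\cdot x)\cdot F + \nabla\big((B\cdot x)\cdot A\big) + B\cdot A, \end{aligned} \tag{13.41} \]
so that
\[ \mathsf{J}(B) = -\big((B\cdot x)\cdot F\big)\cdot F + \tfrac{1}{2}B\cdot x\,\langle F^2\rangle - \nabla\cdot\big((B\cdot x)\cdot A\,F\big). \tag{13.42} \]


The final term is again a total divergence which can be ignored. We therefore
define


\[ \mathsf{J}_{em}(B) = -\big((B\cdot x)\cdot F\big)\cdot F + \tfrac{1}{2}B\cdot x\,\langle F^2\rangle = \mathsf{T}_{em}(x\cdot B), \tag{13.43} \]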
which is now manifestly gauge-invariant. The adjoint is simply
\[ \bar{\mathsf{J}}_{em}(a) = \mathsf{T}_{em}(a)\wedge x. \tag{13.44} \]


Conservation of angular momentum implies that


\[ \nabla\cdot\mathsf{T}_{em}(x\cdot B) = \partial_a\cdot\mathsf{T}_{em}(a\cdot B) = \big(\mathsf{T}_{em}(\partial_a)\wedge a\big)\cdot B = 0. \tag{13.45} \]
This holds because $\mathsf{T}_{em}(a)$ is symmetric.

The redefinition of the energy-momentum and angular momentum tensors for
electromagnetism removes the spin term and absorbs it directly into $\mathsf{T}_{em}(a)\wedge x$.
This guarantees that the fields are gauge-invariant, but suppresses the spin-1 
nature of the electromagnetic field. For gravitational interactions the canonical 
energy-momentum and spin tensors are not as important as their functional 
equivalents. In the case of electromagnetism, the latter are guaranteed to be 
(electromagnetic) gauge-invariant, and the spin contribution does turn out to 
vanish. 


13.2.3 Conformal invariance of free-field electromagnetism 


In addition to invariance under Poincaré transformations, free-field electromag- 
netism is invariant under the full conformal group of spacetime. Conformal 
geometry is discussed in detail in chapter 10. Here we are interested in the field 
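theory manifestation of conformal invariance. We start by considering an arbitrary
displacement, $x' = f(x)$. Gauge invariance tells us that $A$ must transform
in the same manner as $\nabla$ (it is a 1-form), so we define


\[ A'(x) = \bar{\mathsf{f}}\big(A(x')\big). \tag{13.46} \]
The electromagnetic field strength therefore transforms to
\[ \nabla\wedge A'(x) = \bar{\mathsf{f}}\big(\bar{\mathsf{f}}^{-1}(\nabla)\wedge A(x')\big) = \bar{\mathsf{f}}\big(F(x')\big), \tag{13.47} \]
where we have made use of the results
\[ \nabla\wedge\bar{\mathsf{f}}(a) = 0 \tag{13.48} \]
and
\[ \nabla_{x'} = \bar{\mathsf{f}}^{-1}(\nabla). \tag{13.49} \]
These formulae are derived in section 6.5.6. The transformed Lagrangian density
is now
\[ \mathcal{L}' = \tfrac{1}{2}\det(\mathsf{f})^{-1}\big\langle\bar{\mathsf{f}}(F(x'))\,\bar{\mathsf{f}}(F(x'))\big\rangle. \tag{13.50} \]
We therefore define a symmetry of the action integral if $\mathsf{f}$ satisfies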


\[ \bar{\mathsf{f}}(A)\cdot\bar{\mathsf{f}}(B) = \det(\mathsf{f})\,A\cdot B \tag{13.51} \]
for any pair of bivectors $A$ and $B$. This is clearly satisfied by any orthogonal
transformation, but it is also satisfied by dilations. The Lagrangian for the 
free electromagnetic field is therefore symmetric under any displacement whose 
derivative is a local orthogonal transformation coupled with a dilation. This 
defines the conformal group. 
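
As a quick check of condition (13.51), here is a sketch for a pure dilation, assuming four spacetime dimensions as elsewhere in this chapter. The symbol $\mathsf{f}$ denotes the differential of the displacement and $\bar{\mathsf{f}}$ its adjoint.

```latex
\begin{aligned}
f(x) = e^{\alpha}x \quad &\Rightarrow \quad \mathsf{f}(a) = \bar{\mathsf{f}}(a) = e^{\alpha}a, \\
\bar{\mathsf{f}}(A) &= e^{2\alpha}A \quad \text{for any bivector } A, \\
\bar{\mathsf{f}}(A)\cdot\bar{\mathsf{f}}(B) &= e^{4\alpha}A\cdot B, \qquad \det(\mathsf{f}) = e^{4\alpha}.
\end{aligned}
```

In $n$ dimensions the determinant would be $e^{n\alpha}$ while the left-hand side would still scale as $e^{4\alpha}$, so the two sides match only for $n = 4$.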

As a simple example, consider the dilation $x' = \exp(\alpha)x$. For this transformation
Noether's theorem gives


OL 
wve=-a0+¥-(%, (A+ eV Ansa a | f (13.52) 


from which we extract the conserved current 
\[ J = \mathsf{T}(x) + A\cdot F = \mathsf{T}_{em}(x) + \nabla\cdot(A\cdot x\,F). \tag{13.53} \]


The final term is the divergence of a bivector so is automatically conserved.
Dilation invariance therefore tells us that


\[ \nabla\cdot\mathsf{T}_{em}(x) = 0, \tag{13.54} \]


which holds because Tem is conserved and traceless. The latter property is 
typical of scale-invariant theories. 

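Similarly, a special conformal transformation maps the position vector $x$ to $x'$,
where

$$ x' = f(x) = (x^{-1} + \alpha a)^{-1} = x(1 + \alpha a x)^{-1}. \tag{13.55} $$

The derivative transformation is

$$ \underline{\mathsf{f}}(b) = b \cdot \nabla f(x) = (1 + \alpha x a)^{-1}\, b\, (1 + \alpha a x)^{-1}, \tag{13.56} $$

which is a local rotation and dilation. The determinant is

$$ \det(\mathsf{f}) = (1 + 2\alpha\, a \cdot x + \alpha^2 a^2 x^2)^{-4}. \tag{13.57} $$

We also find that

$$ \left. \frac{\partial x'}{\partial \alpha} \right|_{\alpha=0} = -xax \tag{13.58} $$

and

$$ \left. \frac{\partial}{\partial \alpha} \det(\mathsf{f})^{-1} \right|_{\alpha=0} = 8\, a \cdot x. \tag{13.59} $$

Noether's theorem for special conformal transformations can then be shown to
produce the conserved tensor $T_{\mathrm{em}}(xax)$. Conservation again follows from the
properties of $T_{\mathrm{em}}$.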
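
The statements about the special conformal transformation can be tested numerically. The following sketch is an illustration under stated assumptions rather than part of the derivation: vectors are stored as component arrays in a metric of signature $(+,-,-,-)$, the vector inverse is $x^{-1} = x/x^2$, and the identity $xax = 2(a \cdot x)x - x^2 a$ is used. The sample values of `x`, `a` and `alpha` are arbitrary.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(u, v):
    return u @ ETA @ v

def inverse(v):
    """Geometric-algebra inverse of a non-null vector: v^{-1} = v / v^2."""
    return v / dot(v, v)

def special_conformal(x, a, alpha):
    """x' = (x^{-1} + alpha a)^{-1}, as in equation (13.55)."""
    return inverse(inverse(x) + alpha * a)

def jacobian(func, x, step=1e-6):
    """Central-difference Jacobian J[i, j] = d x'^i / d x^j."""
    J = np.zeros((4, 4))
    for j in range(4):
        dx = np.zeros(4)
        dx[j] = step
        J[:, j] = (func(x + dx) - func(x - dx)) / (2 * step)
    return J

x = np.array([2.0, 0.3, -0.5, 0.4])
a = np.array([0.5, 0.1, 0.2, -0.3])
alpha = 0.1

rho = 1 + 2 * alpha * dot(a, x) + alpha**2 * dot(a, a) * dot(x, x)
J = jacobian(lambda y: special_conformal(y, a, alpha), x)

# Local rotation plus dilation: J^T eta J is a multiple of eta.
print(np.allclose(J.T @ ETA @ J, ETA / rho**2, atol=1e-6))
# Determinant, equation (13.57).
print(np.isclose(np.linalg.det(J), rho**-4, atol=1e-6))

# Derivative with respect to alpha at alpha = 0, equation (13.58).
h = 1e-6
dx_dalpha = (special_conformal(x, a, h) - special_conformal(x, a, -h)) / (2 * h)
minus_xax = -(2 * dot(a, x) * x - dot(x, x) * a)
print(np.allclose(dx_dalpha, minus_xax, atol=1e-6))
```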
13.3 Dirac theory


The free-field Dirac Lagrangian is

$$ \mathcal{L} = \langle \nabla\psi I\gamma_3\tilde{\psi} - m\psi\tilde{\psi} \rangle, \tag{13.60} $$

where $\psi$ is a spinor field. Variation with respect to $\psi$ produces the
Euler-Lagrange equation

$$ (\nabla\psi I\gamma_3)^{\sim} - 2m\tilde{\psi} - \partial_\mu(I\gamma_3\tilde{\psi}\gamma^\mu) = 0, \tag{13.61} $$

which reverses to recover the Dirac equation in the form

$$ \nabla\psi I\gamma_3 = m\psi. \tag{13.62} $$

This derivation departs from that given in many textbooks, as we do not consider
$\psi$ and $\bar{\psi}$ as independent variables. Instead we view $\mathcal{L}$ as a real scalar
function of a single field $\psi$. An immediate consequence of the field equations
is that $\mathcal{L} = 0$ when the Dirac equation is satisfied. This behaviour is typical
of first-order systems.
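
The reduction of the Euler-Lagrange equation to the Dirac equation can be spelled out in two lines. This is a sketch that assumes only the reversal rule $(I\gamma_3)^{\sim} = -I\gamma_3$ and a constant frame $\gamma^\mu$:

```latex
\begin{align*}
\partial_\mu\bigl(I\gamma_3\tilde{\psi}\gamma^\mu\bigr)
  &= I\gamma_3\,(\nabla\psi)^{\sim}
   = -\bigl(\nabla\psi I\gamma_3\bigr)^{\sim}, \\
(\nabla\psi I\gamma_3)^{\sim} - 2m\tilde{\psi}
  - \partial_\mu\bigl(I\gamma_3\tilde{\psi}\gamma^\mu\bigr)
  &= 2(\nabla\psi I\gamma_3)^{\sim} - 2m\tilde{\psi} = 0 .
\end{align*}
```

Reversing the final line gives $\nabla\psi I\gamma_3 = m\psi$, and substituting this back into the Lagrangian gives $\mathcal{L} = \langle (\nabla\psi I\gamma_3 - m\psi)\tilde{\psi} \rangle = 0$.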


13.3.1 Spacetime transformations 


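The canonical energy-momentum tensor for the Dirac field is easily found,

$$ T_D(a) = \gamma_\mu \langle a \cdot \nabla\psi\, I\gamma_3\tilde{\psi}\gamma^\mu \rangle - a\mathcal{L}
          = \langle a \cdot \nabla\psi\, I\gamma_3\tilde{\psi} \rangle_1. \tag{13.63} $$

This energy-momentum tensor is not symmetric. Its adjoint is

$$ \bar{T}_D(a) = \dot{\nabla} \langle \dot{\psi} I\gamma_3\tilde{\psi} a \rangle, \tag{13.64} $$

and the antisymmetric term is governed by the bivector

$$ \partial_a \wedge T_D(a) = \dot{\nabla} \wedge \langle \dot{\psi} I\gamma_3\tilde{\psi} \rangle_1. \tag{13.65} $$

This bivector can be written as

$$ \dot{\nabla} \wedge \langle \dot{\psi} I\gamma_3\tilde{\psi} \rangle_1
   = \langle \nabla\psi I\gamma_3\tilde{\psi} \rangle_2 - \tfrac{1}{2} \nabla \cdot (\psi I\gamma_3\tilde{\psi})
   = -\tfrac{1}{2} \nabla \cdot (\psi I\gamma_3\tilde{\psi}). \tag{13.66} $$

So, as stated in section 13.1.1, the antisymmetric component of the
energy-momentum tensor is a total divergence. In this case we can write

$$ \partial_a \wedge T_D(a) = -\tfrac{1}{2} \nabla \cdot S, \tag{13.67} $$

where $S$ is the spin trivector

$$ S = \psi I\gamma_3\tilde{\psi}. \tag{13.68} $$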
Rotational invariance follows from the transformation

$$ \psi'(x) = R\psi(x') \tag{13.69} $$

with $R$ and $x'$ as defined in equation (13.19). The wavefunction $\psi$ is subject to
the single-sided transformation law appropriate for spinors. One can easily show
that the rotors cancel out of the transformed Lagrangian, and the conjugate
angular momentum is

$$ J(B) = \langle (x \cdot B) \cdot \nabla\psi\, I\gamma_3\tilde{\psi} \rangle_1 + \tfrac{1}{2} B \cdot (\psi I\gamma_3\tilde{\psi}). \tag{13.70} $$

The adjoint gives

$$ \bar{J}(a) = \bar{T}_D(a) \wedge x + \tfrac{1}{2} a \cdot S, \tag{13.71} $$


which neatly exposes the spin contribution to the angular momentum. Compar- 
ison with equation (12.56) confirms that the point-particle models discussed in 
section 12.2.1 do correctly capture the properties of the field angular momentum. 

The mass term in the free-field Dirac Lagrangian is the sole term breaking 
conformal invariance. Spacetime spinors have a conformal weight of 3/2, so 
dilations are defined by 


$$ \psi'(x) = e^{3\alpha/2}\, \psi(e^{\alpha} x). \tag{13.72} $$

For this transformation, Noether's theorem gives rise to the canonical vector
$T_D(x)$, which satisfies the partial conservation law

$$ \nabla \cdot T_D(x) = \langle m\psi\tilde{\psi} \rangle. \tag{13.73} $$


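Special conformal transformations are also interesting to consider. With the
transformation as defined in equation (13.55), we write the derivative
transformation as

$$ b \cdot \nabla x' = \underline{\mathsf{f}}(b) = \frac{1}{\rho}\, R\, b\, \tilde{R}, \tag{13.74} $$

where

$$ \rho = 1 + 2\alpha\, a \cdot x + \alpha^2 a^2 x^2, \qquad R = \frac{1 + \alpha a x}{\rho^{1/2}}. \tag{13.75} $$

We define the transformed spinor by

$$ \psi'(x) = \frac{1}{\rho^{3/2}}\, \tilde{R}\, \psi(x') = (1 + \alpha a x)^{-2} (1 + \alpha x a)^{-1} \psi(x'). \tag{13.76} $$

This transformation of $\psi$ defines a symmetry of the action because of the
remarkable result that

$$ \nabla \bigl( (1 + \alpha a x)^{-2} (1 + \alpha x a)^{-1} \bigr) = 0. \tag{13.77} $$

It follows that

$$ \nabla\psi'(x) = \frac{1}{\rho^{5/2}}\, \tilde{R}\, \nabla_{x'} \psi(x'), \tag{13.78} $$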
which is precisely the transformation required in the Dirac action. More
generally, a special conformal transformation can be applied to any spacetime
monogenic to obtain a new monogenic function. Equation (13.77) is an example of
the general result that

$$ \nabla \left( \frac{1 + \alpha x a}{(1 + 2\alpha\, a \cdot x + \alpha^2 a^2 x^2)^{n/2}} \right) = 0, \tag{13.79} $$

which holds in an $n$-dimensional space of arbitrary signature.
The conserved tensor conjugate to special conformal transformations, $T_c$, is
found from Noether's theorem to be

$$ T_c(a) = T_D(xax) + (a \wedge x) \cdot S. \tag{13.80} $$

The partial conservation law for this is

$$ \nabla \cdot T_c(a) = 2m\, a \cdot x\, \langle \psi\tilde{\psi} \rangle. \tag{13.81} $$


For both dilations and special conformal transformations we recover a genuine 
conservation law if the mass m is set to zero. This is the basis for an important 
technique in quantum field theory. In high-energy experiments it is often a 
reasonable approximation to treat the particles as massless. One can then take 
advantage of the conformal symmetry to compute a range of consequences for 
the outcome of experiment. Typically, these predictions will be valid up to order 
m/E, where E is the energy. 


13.3.2 Internal symmetries and phase invariance 


As well as spacetime symmetries there are a number of internal symmetries of 
the Dirac action we can consider. The first of these is the duality transformation 


$$ \psi' = \psi e^{I\alpha}. \tag{13.82} $$

Equation (13.4) produces the relation

$$ \nabla \cdot (\psi\gamma_3\tilde{\psi}) = 2\langle m I \psi\tilde{\psi} \rangle. \tag{13.83} $$

So the spin vector defines a conserved current in the massless limit. This is the
partially-conserved axial current, which is important in scattering calculations.
Further transformations to consider are internal rotations of the form

$$ \psi' = \psi e^{\alpha B}, \tag{13.84} $$

where $B$ is a bivector. In this case equation (13.4) reduces to

$$ \nabla \cdot \bigl( \psi\, B \cdot (I\gamma_3)\, \tilde{\psi} \bigr) = 0, \tag{13.85} $$

where we have applied the Dirac equation. This yields conserved currents for
any component of $B$ which commutes with $\gamma_3$. This space is spanned by $\sigma_1$, $\sigma_2$
and $I\sigma_3$. Of these, only $I\sigma_3$ has the additional property of leaving invariant the
observable current $\psi\gamma_0\tilde{\psi}$. This is the case of a phase transformation, and the
conjugate conserved quantity is precisely the current $J$, so

$$ \nabla \cdot J = 0, \qquad J = \psi\gamma_0\tilde{\psi}. \tag{13.86} $$


This is an example of the general result in quantum theory that phase invari- 
ance ensures that probability density is conserved, and wavefunction evolution 
is unitary. 
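
The claim about which bivectors commute with $\gamma_3$ and $\gamma_0$ is easy to confirm in a matrix representation. The sketch below assumes the Dirac representation of the gamma matrices as a stand-in for the frame vectors $\gamma_\mu$; this choice is ours, made for illustration, and any faithful representation would serve equally well.

```python
import numpy as np

# Dirac-representation gamma matrices, used here as the frame vectors gamma_mu.
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
g0 = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
g = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]

I = g[0] @ g[1] @ g[2] @ g[3]                # pseudoscalar
sigma = [g[k] @ g[0] for k in (1, 2, 3)]     # relative bivectors sigma_k
I_sigma = [I @ s for s in sigma]             # spatial bivectors I sigma_k

bivectors = {f"sigma{k + 1}": sigma[k] for k in range(3)}
bivectors.update({f"Isigma{k + 1}": I_sigma[k] for k in range(3)})

def commutes(A, B):
    return np.allclose(A @ B, B @ A)

commute_g3 = sorted(n for n, B in bivectors.items() if commutes(B, g[3]))
commute_g3_and_g0 = sorted(n for n in commute_g3 if commutes(bivectors[n], g[0]))

print(commute_g3)          # ['Isigma3', 'sigma1', 'sigma2']
print(commute_g3_and_g0)   # ['Isigma3']
```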

The phase transformation law

$$ \psi \mapsto \psi' = \psi e^{\phi I\sigma_3} \tag{13.87} $$

is a global symmetry of the Lagrangian, because $\phi$ is a constant. If $\psi$ satisfies
the Dirac equation, then so too does $\psi'$. We arrive at a gauge theory if we convert
this global symmetry to a local one. There are a number of reasons for believing
that this is a sensible way to construct interactions in field theory. One
motivation is from the structure of the physical statements that can be extracted
from Dirac theory. Quantum theory makes predictions about the values of
observables, which are formed from inner products between spinors, $\langle\psi|\phi\rangle$. These
inner products are invariant under local changes of phase. Similarly, quantum
theory can make statements about the equality of two spinor expressions, for
example

$$ \psi = \psi_1 + \psi_2. \tag{13.88} $$

This might decompose $\psi$ into two orthogonal eigenstates of some operator.
Again, if all spinors pick up the same locally-varying phase factor then the physi- 
cal predictions are unchanged. In addition, a global change of phase corresponds 
to simultaneously changing the phase of the wavefunction everywhere in the uni- 
verse. While this can be conceived of mathematically, it does not make a great 
deal of physical sense. The ultimate motivation, however, comes from the fact 
that gauge theories are spectacularly successful. All of the known fundamental 
forces can be described by the procedure of turning a global symmetry into a 
local symmetry. 


13.3.3 Covariant derivatives and minimal coupling


Now that we are clear on the motivation, we must find how to modify the Dirac 
equation in order that phase changes become a local symmetry. This is the 
prototype gauge theory. We start by writing 


$$ \psi' = \psi R, \tag{13.89} $$

where $R$ is a position-dependent rotor. We will later set $R = \exp(I\sigma_3\phi(x))$. This
slightly more general formulation eases the transition to the more complicated
cases of electroweak and gravitational interactions. The equation for $\psi'$ now
includes the term

$$ \nabla\psi' = \gamma^\mu (\partial_\mu\psi\, R + \psi\, \partial_\mu R). \tag{13.90} $$

We need to modify the $\nabla$ operator to be able to cancel out the term in the
derivative of $R$. We therefore define a new, covariant derivative operator $D$,
where

$$ D\psi = \gamma^\mu D_\mu\psi. \tag{13.91} $$

The directional covariant derivatives $D_\mu$ contain an extra term going as

$$ D_\mu\psi = \partial_\mu\psi + \tfrac{1}{2} \psi\, \Omega_\mu, \tag{13.92} $$

where $\Omega_\mu$ is a multivector field whose nature and transformation properties we
have to determine. (The factor of 1/2 is inserted for later convenience.) The
index indicates that $\Omega_\mu$ is a linear function. We can therefore write

$$ \Omega_\mu = \Omega(\gamma_\mu) = \Omega(\gamma_\mu; x), \tag{13.93} $$

which defines the linear function $\Omega(a) = \Omega(a; x)$. The $x$ dependence records the
fact that the field will in general be a function of position. This label is usually
suppressed. In later applications we will make strong use of the index-free form
$\Omega(a)$.

The behaviour we require is that under a local rotation, $D$ should transform
in such a way that $\psi R$ is still a solution of the modified equation. So, with $D$
transforming to $D'$, we require that

$$ D'(\psi R) = (D\psi) R \tag{13.94} $$

for any $R$. We expect that $D'$ should have the same functional form as $D$, so we
also have

$$ D'\psi = \gamma^\mu \bigl( \partial_\mu\psi + \tfrac{1}{2} \psi\, \Omega'_\mu \bigr). \tag{13.95} $$

Equation (13.94) therefore gives

$$ D'(\psi R) = \gamma^\mu \bigl( \partial_\mu\psi\, R + \psi\, \partial_\mu R + \tfrac{1}{2} \psi R\, \Omega'_\mu \bigr)
             = \gamma^\mu \bigl( \partial_\mu\psi + \tfrac{1}{2} \psi\, \Omega_\mu \bigr) R. \tag{13.96} $$

From this we can read off that

$$ \partial_\mu R + \tfrac{1}{2} R\, \Omega'_\mu = \tfrac{1}{2} \Omega_\mu R, \tag{13.97} $$

which establishes the transformation law

$$ \Omega'_\mu = \tilde{R}\, \Omega_\mu R - 2\tilde{R}\, \partial_\mu R. \tag{13.98} $$

Now $R$ is a rotor, so $2\tilde{R}\,\partial_\mu R$ is a member of the Lie algebra of the rotor group. It
follows that this term is a pure bivector, so $\Omega_\mu$ must also contain a bivector term
if it is to cancel a term in $2\tilde{R}\,\partial_\mu R$. We assume that this is the only term present
in the $\Omega_\mu$ field. This is the minimal assumption, and is referred to as defining
minimal coupling.

The important point in this derivation is that we have used the form of the
term $-2\tilde{R}\,\partial_\mu R$ to say what type of object $\Omega_\mu$ is. We are not asserting that $\Omega_\mu$
is equal to $-2\tilde{R}\,\partial_\mu R$. On the contrary, as will become apparent later, if $\Omega_\mu$ was
given by the gradient of a rotor in this manner it would give rise to a vanishing
field strength and therefore be of no physical interest. This step, of taking a term
arising from a derivative (like $-2\tilde{R}\,\partial_\mu R$ here), and generalizing it to a field not in
general derivable from a derivative, is the essence of the gauging process. The $\Omega_\mu$
term in the covariant derivative is called a connection. In general, connections
take their values in the Lie algebra of the associated symmetry group. Many of
the symmetry groups we consider are rotor groups, so for these the connections
are bivector fields.
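
The remark that a pure-gauge connection has vanishing field strength can be verified directly. The following worked sketch assumes the field strength $\partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu - \Omega_\mu \times \Omega_\nu$ constructed in section 13.3.5, with $A \times B = \tfrac{1}{2}(AB - BA)$, and uses only $\tilde{R}R = 1$:

```latex
\begin{align*}
\Omega_\mu &= -2\tilde{R}\,\partial_\mu R, \qquad
\tilde{R}\,\partial_\mu R = -(\partial_\mu\tilde{R})\,R, \\
\partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu
  &= -2\bigl(\partial_\mu\tilde{R}\,\partial_\nu R
            - \partial_\nu\tilde{R}\,\partial_\mu R\bigr), \\
\Omega_\mu \times \Omega_\nu
  &= 2\bigl(\tilde{R}\,\partial_\mu R\,\tilde{R}\,\partial_\nu R
          - \tilde{R}\,\partial_\nu R\,\tilde{R}\,\partial_\mu R\bigr)
   = -2\bigl(\partial_\mu\tilde{R}\,\partial_\nu R
            - \partial_\nu\tilde{R}\,\partial_\mu R\bigr), \\
\partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu - \Omega_\mu \times \Omega_\nu &= 0 .
\end{align*}
```

The second-derivative terms cancel in the antisymmetrised derivative because partial derivatives commute.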


13.3.4 The minimally coupled Dirac equation 


Returning to electromagnetism, we are concerned with the restricted class of
rotations that take place entirely in the $\gamma_2\gamma_1$ plane. In this case, writing
$R = \exp(I\sigma_3\phi)$, we have

$$ -2\tilde{R}\, \partial_\mu R = -2 e^{-I\sigma_3\phi}\, \partial_\mu e^{I\sigma_3\phi} = -2 \gamma_\mu \cdot (\nabla\phi)\, I\sigma_3. \tag{13.99} $$

In generalizing to $\Omega_\mu$ we see that this must take the form

$$ \Omega_\mu = -\lambda\, \gamma_\mu \cdot A\, I\sigma_3 \tag{13.100} $$

or, in frame-free notation,

$$ \Omega(a) = -\lambda\, a \cdot A\, I\sigma_3. \tag{13.101} $$

Here $A$ is a spacetime vector field, and $\lambda$ is some coupling constant. We now
reassemble our full, covariant Dirac equation to obtain

$$ D\psi I\gamma_3 = \gamma^\mu \bigl( \partial_\mu\psi - \tfrac{1}{2} \lambda\, \psi\, \gamma_\mu \cdot A\, I\sigma_3 \bigr) I\gamma_3 = m\psi. \tag{13.102} $$

This simplifies to give

$$ \nabla\psi I\gamma_3 - \tfrac{1}{2} \lambda A\psi\gamma_0 = m\psi, \tag{13.103} $$

and we see that the contraction between the $\gamma^\mu$ frame and the connection in
equation (13.102) assembles to give a vector multiplying $\psi$ from the left. It is
clear that for an electron we require $\lambda = 2e$, so the minimally coupled Dirac
equation is

$$ \nabla\psi I\sigma_3 - eA\psi = m\psi\gamma_0, \tag{13.104} $$

as studied in section 8.3. A local phase transformation of $\psi$ now induces the
transformation

$$ eA \mapsto eA - \nabla\phi, \tag{13.105} $$

which we recognise as an electromagnetic change of gauge. By adding an
interaction term solely in $A$ we are making the simplest possible modification to
the original equation, which is the essence of minimal coupling. We could, for
example, add further terms in $F$, or $F^2$, multiplying $\psi$, and the equation would
still be gauge-invariant. It appears, however, that this possibility is not required
for describing the fundamental forces. Why this should be so is unknown.


13.3.5 The gauge field strength 


Now that we have introduced the gauge fields the next step is to construct the
observable (gauge-invariant) quantities associated with them. For
electromagnetism we know that these are the $E$ and $B$ fields, which form part of the
field strength tensor. This is found in general by commuting covariant derivatives.
We form

$$ [D_\mu, D_\nu]\psi = D_\mu \bigl( \partial_\nu\psi + \tfrac{1}{2} \psi\, \Omega_\nu \bigr) - D_\nu \bigl( \partial_\mu\psi + \tfrac{1}{2} \psi\, \Omega_\mu \bigr)
   = \tfrac{1}{2} \psi \bigl( \partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu - \Omega_\mu \times \Omega_\nu \bigr). \tag{13.106} $$

Despite the fact that we formed commutators of derivatives on $\psi$, all of the
derivatives of $\psi$ have cancelled, and we are left with a single object

$$ \mathsf{F}_{\mu\nu} = \mathsf{F}(\gamma_\mu \wedge \gamma_\nu) = \partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu - \Omega_\mu \times \Omega_\nu. \tag{13.107} $$

This is a bivector-valued linear function of the bivector argument $\gamma_\mu \wedge \gamma_\nu$. The
construction of this object guarantees that under a change of gauge

$$ \mathsf{F}_{\mu\nu} \mapsto \mathsf{F}'_{\mu\nu} = \tilde{R}\, \mathsf{F}_{\mu\nu} R. \tag{13.108} $$

This transformation tells us that the field strength transforms covariantly under
changes of gauge.

Specialising to the case of electromagnetism, where $\Omega_\mu = -2e\, \gamma_\mu \cdot A\, I\sigma_3$, we
find that the term multiplying $\psi$ contains

$$ (-2e)^{-1} \mathsf{F}_{\mu\nu} = \partial_\mu(\gamma_\nu \cdot A\, I\sigma_3) - \partial_\nu(\gamma_\mu \cdot A\, I\sigma_3) - \gamma_\mu \cdot A\, \gamma_\nu \cdot A\, I\sigma_3 \times I\sigma_3
   = (\gamma_\nu \wedge \gamma_\mu) \cdot (\nabla \wedge A)\, I\sigma_3
   = (\gamma_\nu \wedge \gamma_\mu) \cdot F\, I\sigma_3. \tag{13.109} $$

This is a function that maps the bivector $\gamma_\mu \wedge \gamma_\nu$ linearly onto a pure phase
term. For most applications of electromagnetism it is sensible to lose the
mapping nature of the field strength and instead work directly with the bivector
$F$. For more complicated gauge fields this is not appropriate. In forming the
commutator of covariant derivatives we have extracted the correct field strength,
$F = \nabla \wedge A$, which encodes the physically measurable content of the
electromagnetic field. The electromagnetic field strength is invariant under a change of
gauge, as opposed to covariant. This is because the underlying gauge group,
U(1), is a commutative group, so the rotors cancel out in equation (13.108). The
picture is less simple for non-commutative Lie groups.
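
Gauge invariance of the electromagnetic field strength can be illustrated with a short numerical check. The sketch below works with components $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ evaluated by central differences. The potential `potential`, the gauge function behind `grad_phase` and the value of `E_CHARGE` are hypothetical choices made purely for illustration.

```python
import numpy as np

E_CHARGE = 1.0  # illustrative value of the coupling e

def potential(x):
    """Hypothetical vector potential components A_mu(x)."""
    t, X, Y, Z = x
    return np.array([X * Y, t * Z, -Y * Y + t, X * Z])

def grad_phase(x):
    """Gradient of the hypothetical gauge function phi = t*X + Y*Z."""
    t, X, Y, Z = x
    return np.array([X, t, Z, Y])

def gauged_potential(x):
    """Potential after the change of gauge eA -> eA - grad(phi)."""
    return potential(x) - grad_phase(x) / E_CHARGE

def field_strength(A, x, step=1e-3):
    """F[mu, nu] = d_mu A_nu - d_nu A_mu by central differences."""
    dA = np.zeros((4, 4))
    for mu in range(4):
        dx = np.zeros(4)
        dx[mu] = step
        dA[mu] = (A(x + dx) - A(x - dx)) / (2 * step)
    return dA - dA.T

point = np.array([0.4, -1.2, 0.7, 2.0])
F_before = field_strength(potential, point)
F_after = field_strength(gauged_potential, point)
print(np.allclose(F_before, F_after, atol=1e-8))
```

The potentials differ at every point, but the antisymmetrised derivative does not see the gradient term.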


13.3.6 Electroweak symmetry 


A full treatment of electroweak gauge theory requires the apparatus of quantum 
field theory, which is beyond the scope of this book. Here we give a simplified 
treatment, concentrating entirely on the fermionic sector for an electron and a 
neutrino. The left-handed particles in this sector are assembled into a doublet 


$$ L_e = \begin{pmatrix} |\nu_e\rangle \\ |e_l\rangle \end{pmatrix}, \tag{13.110} $$

and the right-handed particles consist of a singlet state $|e_r\rangle$. The kets denote
Dirac spinors, projected into their left-handed or right-handed states. The
left-hand doublet is acted on by SU(2) matrices, which transform the upper and
lower components into linear superpositions of $|\nu_e\rangle$ and $|e_l\rangle$. To construct an
equivalent group action in spacetime algebra, we introduce the spinor $\psi_l$, where

$$ \begin{pmatrix} |\nu_e\rangle \\ |e_l\rangle \end{pmatrix} \leftrightarrow \psi_l = \psi_e \tfrac{1}{2}(1 - \sigma_3) - \psi_\nu I\sigma_2 \tfrac{1}{2}(1 + \sigma_3). \tag{13.111} $$

Here $\psi_e$ and $\psi_\nu$ are the spacetime algebra equivalents of the $|e_l\rangle$ and $|\nu_e\rangle$ spinors,
as defined by the map of equation (8.69). This map ensures that the action of
the generators of the SU(2) group become

$$ \hat{\sigma}_k L_e \leftrightarrow \psi_l \sigma_k, \tag{13.112} $$

and hence

$$ i L_e \leftrightarrow -\psi_l I. \tag{13.113} $$

So all transformations are now carried out on the right-hand side of $\psi_l$, and are
of the class discussed in section 13.3.2.

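The kinetic term in the Lagrangian for the left-handed doublet is usually
written as

$$ \bar{L}_e\, i\,\not\!\partial\, L_e = \langle \bar{\nu}_e | i\,\not\!\partial | \nu_e \rangle + \langle \bar{e}_l | i\,\not\!\partial | e_l \rangle, \tag{13.114} $$

which has the multivector equivalent

$$ \mathcal{L}_l = \bigl\langle \nabla\psi_\nu \tfrac{1}{2}(1 - \sigma_3) I\gamma_3 \tilde{\psi}_\nu + \nabla\psi_e \tfrac{1}{2}(1 - \sigma_3) I\gamma_3 \tilde{\psi}_e \bigr\rangle. \tag{13.115} $$

Now

$$ \tfrac{1}{2}(1 - \sigma_3) I\gamma_3 = -\tfrac{1}{2}(1 - \sigma_3) I\gamma_0, \qquad
   \tfrac{1}{2}(1 - \sigma_3) I\gamma_3 = I\sigma_2 \tfrac{1}{2}(1 + \sigma_3) I\gamma_0\, I\sigma_2, \tag{13.116} $$

so

$$ \mathcal{L}_l = -\bigl\langle \nabla \bigl( \psi_e \tfrac{1}{2}(1 - \sigma_3) - \psi_\nu I\sigma_2 \tfrac{1}{2}(1 + \sigma_3) \bigr) I\gamma_0 \tilde{\psi}_l \bigr\rangle
   = -\langle \nabla\psi_l I\gamma_0 \tilde{\psi}_l \rangle. \tag{13.117} $$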
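
The two projector identities used to pass from the $I\gamma_3$ form to the $I\gamma_0$ form are purely algebraic, so they can be confirmed in any faithful matrix representation. The sketch below assumes the Dirac representation of the gamma matrices as the frame $\gamma_\mu$ (our choice, for illustration) and checks that $\tfrac{1}{2}(1-\sigma_3)I\gamma_3$ equals both $-\tfrac{1}{2}(1-\sigma_3)I\gamma_0$ and $I\sigma_2\,\tfrac{1}{2}(1+\sigma_3)\,I\gamma_0\,I\sigma_2$.

```python
import numpy as np

# Dirac-representation gamma matrices, used as the frame vectors gamma_mu.
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
g0 = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
g1, g2, g3 = (np.block([[Z2, s], [-s, Z2]]) for s in PAULI)
ONE = np.eye(4, dtype=complex)

I = g0 @ g1 @ g2 @ g3          # pseudoscalar
sigma2 = g2 @ g0
sigma3 = g3 @ g0
P_minus = 0.5 * (ONE - sigma3)  # (1 - sigma_3)/2
P_plus = 0.5 * (ONE + sigma3)   # (1 + sigma_3)/2

lhs = P_minus @ I @ g3
first = -P_minus @ I @ g0
second = (I @ sigma2) @ P_plus @ I @ g0 @ (I @ sigma2)

print(np.allclose(lhs, first), np.allclose(lhs, second))
```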


The left-handed fermionic sector of the electroweak Lagrangian is similar to the
Dirac Lagrangian, but with $\gamma_3$ replaced by $\gamma_0$. The internal symmetry group is
therefore defined by transformations of the form

$$ \psi_l \mapsto \psi_l e^{M}, \tag{13.118} $$

where $M$ is any even multivector that satisfies

$$ \exp(M)\, \gamma_0\, \exp(\tilde{M}) = \gamma_0. \tag{13.119} $$

This picks out the set of bivectors that commute with $\gamma_0$, and the pseudoscalar.
The former define an SU(2) group, and the latter is a U(1) phase term. The
Lagrangian therefore has the expected SU(2)×U(1) symmetry of electroweak
theory, encoded in a very natural way in the spacetime algebra.
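
The structure of this symmetry group can be explored numerically. The sketch below again assumes the Dirac matrix representation for the frame $\gamma_\mu$. The helper `exp_generator` is our own and uses the closed form of the exponential for a generator that squares to $\pm 1$. It checks that the bivectors $I\sigma_k$ and the pseudoscalar satisfy $\exp(M)\gamma_0\exp(\tilde{M}) = \gamma_0$, that a boost generator $\sigma_1$ does not, and that the $I\sigma_k$ close under commutation.

```python
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
g0 = np.block([[I2, Z2], [Z2, -I2]]).astype(complex)
g = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
ONE = np.eye(4, dtype=complex)

I = g[0] @ g[1] @ g[2] @ g[3]
sigma = [g[k] @ g[0] for k in (1, 2, 3)]
I_sigma = [I @ s for s in sigma]

def exp_generator(M, theta, square):
    """exp(theta M) for a generator with M @ M = square * identity."""
    if square < 0:
        return np.cos(theta) * ONE + np.sin(theta) * M
    return np.cosh(theta) * ONE + np.sinh(theta) * M

def preserves_g0(M, square, reverse_sign, theta=0.37):
    """Test exp(M) g0 exp(M~), where reversion sends M to reverse_sign * M."""
    left = exp_generator(M, theta, square)
    right = exp_generator(M, reverse_sign * theta, square)
    return np.allclose(left @ g0 @ right, g0)

su2_ok = all(preserves_g0(B, square=-1, reverse_sign=-1) for B in I_sigma)
u1_ok = preserves_g0(I, square=-1, reverse_sign=+1)
boost_ok = preserves_g0(sigma[0], square=+1, reverse_sign=-1)

commutator = I_sigma[0] @ I_sigma[1] - I_sigma[1] @ I_sigma[0]
closes = np.allclose(commutator, -2 * I_sigma[2])

print(su2_ok, u1_ok, boost_ok, closes)   # True True False True
```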

The right-handed sector of the electroweak theory involves a singlet state

$$ \psi_r = \psi_e \tfrac{1}{2}(1 + \sigma_3). \tag{13.120} $$

The kinetic term for this is

$$ \langle \nabla\psi_r I\gamma_3 \tilde{\psi}_r \rangle = -\bigl\langle \nabla\psi_e \tfrac{1}{2}(\gamma_0 + \gamma_3) I \tilde{\psi}_e \bigr\rangle. \tag{13.121} $$

Mass terms are introduced via interaction with the Higgs field, which can be
modelled straightforwardly as an interaction between left-handed and right-handed
particles. A global SU(2) transformation is described by

$$ \psi_l \mapsto \psi_l R, \tag{13.122} $$

where $R$ is a rotor satisfying $R\gamma_0\tilde{R} = \gamma_0$. This is converted to a local symmetry
following the procedure of section 13.3.2, which tells us that the connection
consists of bivectors which commute with $\gamma_0$. The U(1) connection is a multiple
of the pseudoscalar. The field strength is defined similarly, and one can proceed
to model spontaneous symmetry breaking using this scheme. At some point,
however, it is necessary to adopt a quantum field theory perspective, and replace
the wavefunctions described here by operators acting on the quantum vacuum.
13.4 Gauge principles for gravitation


We have so far described electromagnetism and electroweak forces in terms of 
gauge theories. We now turn our attention to gravity. Our aim is to model 
gravitational interactions in terms of gauge fields defined in the spacetime al- 
gebra. This initially appears to be a radical departure from general relativity, 
but in fact the two approaches converge in a manner that sheds light on the 
physical structure of the theory. Spacetime algebra is the geometric algebra of 
flat spacetime, and the introduction of fields cannot alter this basic property. 
What then are we to make of the standard arguments that spacetime is curved? 
The answer is that all of these arguments involve light paths, or measuring rods, 
or similar devices, and all of these processes are also modelled by fields. Since all 
physical quantities correspond to fields, the absolute position and orientation of 
particles or fields in our background spacetime is not measurable. It drops out 
of all physical calculations. The only predictions that can be extracted are rel- 
ative relations between fields. Ensuring that this property is true locally means 
there is no conflict with any of the principles by which one is traditionally led to 
general relativity, and naturally guides us in the direction of a gauge theory. 

To illustrate these considerations, consider possible relations between quantum
fields. Suppose that $\psi_1(x)$ and $\psi_2(x)$ are spinor fields. A physical statement
could be a simple relation of equality:

$$ \psi_1(x) = \psi_2(x). \tag{13.123} $$

But all this statement says is that at a point where one field has a particular
value, then the second field has the same value. This statement is completely
independent of where we choose to place the fields in the spacetime algebra.
And, more importantly, it is totally independent of where we choose to locate
other values of the fields. We could equally well introduce two new fields

$$ \psi_1'(x) = \psi_1(x'), \qquad \psi_2'(x) = \psi_2(x'), \tag{13.124} $$

where $x'$ is an arbitrary function of position $x$. The statement $\psi_1'(x) = \psi_2'(x)$
contains precisely the same physical content as the original equation.

The same picture emerges if both fields are acted on by a spacetime rotor,
giving rise to new fields

$$ \psi_1' = R\psi_1, \qquad \psi_2' = R\psi_2. \tag{13.125} $$

Again, the statement $\psi_1' = \psi_2'$ has the same physical content as the original
equation. Similar considerations apply to the observables formed from $\psi$, such
as the vector $J = \psi\gamma_0\tilde{\psi}$. Replacing $\psi$ by $\psi'$ produces the new vector
$J' = RJ\tilde{R}$. Invariance of the equations under this transformation ensures that the
absolute direction of vectors in the spacetime algebra is not measurable, only
the relative orientation of two physical vectors is measurable. We now have a
clear mathematical statement of the invariance properties we want to establish.
The next task is to study the form of the gauge fields needed to enforce this
invariance.


13.4.1 Displacements 


We write $x' = f(x)$ for an arbitrary (differentiable) map between spacetime
position vectors. The transformation we are interested in is where the field $\psi(x)$
is transformed to the new field

$$ \psi'(x) = \psi(x'). \tag{13.126} $$


The map f(x) should not be thought of as a map between manifolds, or as 
moving points around. The function f(x) is just a rule for relating one position 
vector to another within a single vector space. It is the fields that are trans- 
formed in this space. We need a good name for this operation of moving fields 
around. One possibility is translation, but this suggests a rigid map where all 
fields are translated by the same amount. Mathematicians favour the term dif- 
feomorphism, but this usually refers to a map between distinct manifolds. We 
prefer to use the term displacement, which does suggest the concept of moving 
a field around from one point to another in an arbitrary manner. 

The next step is to consider the behaviour of the derivative of $\psi$. With the
displacement denoted by $x' = f(x)$, and the derivative defined by

$$ \underline{\mathsf{f}}(a) = a \cdot \nabla f(x), \tag{13.127} $$

we know that the vector derivative satisfies

$$ \nabla_x = \bar{\mathsf{f}}(\nabla_{x'}). \tag{13.128} $$

So, for example, if $\psi(x)$ is a spinor, and $\psi'(x) = \psi(x')$, we have

$$ \nabla\psi'(x) = \bar{\mathsf{f}}(\nabla_{x'})\, \psi(x'). \tag{13.129} $$

To formulate a version of the Dirac action that is invariant under arbitrary
displacements, we must introduce a gauge field that removes the effect of the
$\mathsf{f}$ function. This field will then assemble with the vector derivative to form an
object which, under displacements, simply reevaluates to the derivative with
respect to the new position vector. We construct such an object by replacing $\nabla$
with a new derivative $\bar{\mathsf{h}}(\nabla)$, where

$$ \bar{\mathsf{h}}(a) = \bar{\mathsf{h}}(a; x) \tag{13.130} $$

is a position-dependent linear function of $a$. We again suppress this position
dependence where clarity permits.
Under displacements the gauge field $\bar{\mathsf{h}}$ must transform such that

$$ \bar{\mathsf{h}}'(\nabla_x) = \bar{\mathsf{h}}(\nabla_{x'}) = \bar{\mathsf{h}}\, \bar{\mathsf{f}}^{-1}(\nabla_x). \tag{13.131} $$
Explicitly, the transformation law for $\bar{\mathsf{h}}$ under displacements must be

$$ \bar{\mathsf{h}}'(a; x) = \bar{\mathsf{h}}\bigl(\bar{\mathsf{f}}^{-1}(a); x'\bigr) \tag{13.132} $$

or, suppressing the position dependence,

$$ \bar{\mathsf{h}}'(a) = \bar{\mathsf{h}}\, \bar{\mathsf{f}}^{-1}(a). \tag{13.133} $$

This must hold for any arbitrary vector $a$. This transformation law is different
to that encountered in the gauge theories discussed previously, as the gauge field
acts directly on $\nabla$. The $\bar{\mathsf{h}}$ field is therefore not a connection in the conventional
Yang-Mills sense. It is clear, however, that the $\bar{\mathsf{h}}$ field embodies the idea of
ensuring that a symmetry is local, so can sensibly be called a gauge field. Since
$\bar{\mathsf{h}}(a)$ is an arbitrary, position-dependent linear function of $a$, it has 4 × 4 = 16
degrees of freedom.
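
The way the gauge field absorbs the displacement can be seen in one line. The following worked sketch combines the transformation law of the gauge field with the chain rule for the vector derivative, with position dependence suppressed as in the text:

```latex
\begin{align*}
\bar{\mathsf{h}}'(\nabla_x)\,\psi'(x)
  &= \bar{\mathsf{h}}\,\bar{\mathsf{f}}^{-1}(\nabla_x)\,\psi(x')
     && \text{transformation law for } \bar{\mathsf{h}} \\
  &= \bar{\mathsf{h}}\,\bar{\mathsf{f}}^{-1}\bigl(\bar{\mathsf{f}}(\nabla_{x'})\bigr)\,\psi(x')
     && \text{since } \nabla_x = \bar{\mathsf{f}}(\nabla_{x'}) \\
  &= \bar{\mathsf{h}}(\nabla_{x'})\,\psi(x') .
\end{align*}
```

The combination $\bar{\mathsf{h}}(\nabla)\psi$ therefore simply re-evaluates at the displaced point, which is the defining property of a covariant object.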

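We can now systematically replace every occurrence of $\nabla$ with $\bar{\mathsf{h}}(\nabla)$, and all
our equations will be invariant under arbitrary displacements. In particular, the
Dirac Lagrangian density is now modified to read

$$ \mathcal{L} = \det(\mathsf{h})^{-1} \bigl\langle \bar{\mathsf{h}}(\nabla)\psi I\gamma_3\tilde{\psi} - m\psi\tilde{\psi} \bigr\rangle. \tag{13.134} $$

This now transforms covariantly under arbitrary displacements of the fields.

Similarly, we can consider the proper time or distance along a trajectory $x(\lambda)$.
In the absence of gravitational fields this is

$$ S = \int d\lambda\, \left| \frac{\partial x}{\partial\lambda} \cdot \frac{\partial x}{\partial\lambda} \right|^{1/2}. \tag{13.135} $$

Under a displacement the path transforms to $f(x(\lambda))$, so the tangent vector
transforms to

$$ \partial_\lambda f(x(\lambda)) = \underline{\mathsf{f}}(\partial_\lambda x). \tag{13.136} $$

We can therefore construct a gauge-invariant interval by setting

$$ S = \int d\lambda\, \left| \underline{\mathsf{h}}^{-1}(x') \cdot \underline{\mathsf{h}}^{-1}(x') \right|^{1/2}, \tag{13.137} $$

where

$$ x' = \frac{\partial x(\lambda)}{\partial\lambda}. \tag{13.138} $$

This distance is now invariant under displacements, so is a physically-observable
quantity.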

We now see that tangent vectors pick up a factor of $\underline{\mathsf{h}}^{-1}$ and cotangent vectors
a factor of $\bar{\mathsf{h}}$. Spinors are not acted on by the $\mathsf{h}$ function. Next we establish
contact with more familiar constructions of general relativity. Suppose that $x^\mu$
denote an arbitrary coordinate system, with frame vectors denoted by

$$ e_\mu = \frac{\partial x}{\partial x^\mu}, \qquad e^\mu = \nabla x^\mu. \tag{13.139} $$

In terms of this coordinate system, equation (13.137) involves the term

$$ \underline{\mathsf{h}}^{-1}(x') \cdot \underline{\mathsf{h}}^{-1}(x') = \frac{\partial x^\mu}{\partial\lambda} \frac{\partial x^\nu}{\partial\lambda}\, \underline{\mathsf{h}}^{-1}(e_\mu) \cdot \underline{\mathsf{h}}^{-1}(e_\nu). \tag{13.140} $$

If we define the vectors

$$ g_\mu = \underline{\mathsf{h}}^{-1}(e_\mu), \qquad g^\mu = \bar{\mathsf{h}}(e^\mu), \tag{13.141} $$

then we can write the preceding term as

$$ \underline{\mathsf{h}}^{-1}(x') \cdot \underline{\mathsf{h}}^{-1}(x') = \frac{\partial x^\mu}{\partial\lambda} \frac{\partial x^\nu}{\partial\lambda}\, g_\mu \cdot g_\nu. \tag{13.142} $$

Equation (13.137) is therefore equivalent to the line interval in general relativity
if we set the metric equal to

$$ g_{\mu\nu} = g_\mu \cdot g_\nu = \underline{\mathsf{h}}^{-1}(e_\mu) \cdot \underline{\mathsf{h}}^{-1}(e_\nu). \tag{13.143} $$

The gauge field $\mathsf{h}$ is therefore a form of square root of the metric, which allows us
to replace the metric inner product with the inner product in the spacetime
algebra. In this sense, $\mathsf{h}$ is closely related to the concept of a spacetime orthonormal
tetrad or vierbein. A vierbein is obtained from the $\mathsf{h}$ field by defining

$$ e_\mu{}^{i} = g_\mu \cdot \gamma^{i}, \qquad e^\mu{}_{i} = g^\mu \cdot \gamma_{i}, \tag{13.144} $$

where both $i$ and $\mu$ run from 0 to 3. The advantage of working directly with
the $\mathsf{h}$ field is that it frees us from any coordinate frame. Coordinate frames are
best introduced at a later date, when the geometry of a given problem usually
dictates the appropriate coordinate system.
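
The relation between the gauge field and the metric can be made concrete at a single spacetime point. The sketch below assumes that the linear function is represented by a matrix `H` acting on vector components in the $\gamma_\mu$ frame, that the coordinate frame is simply $e_\mu = \gamma_\mu$, and that a random perturbation of the identity stands in for a physical field. All names are illustrative.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(u, v):
    return u @ ETA @ v

rng = np.random.default_rng(1)
H = np.eye(4) + 0.1 * rng.normal(size=(4, 4))  # matrix of h at one point
H_inv = np.linalg.inv(H)
H_bar = ETA @ H.T @ ETA                        # adjoint: a . h_bar(b) = h(a) . b

e_lower = np.eye(4)     # column mu holds the components of e_mu = gamma_mu
e_upper = ETA.copy()    # column mu holds the components of e^mu = gamma^mu

g_lower = H_inv @ e_lower    # g_mu = h^{-1}(e_mu)
g_upper = H_bar @ e_upper    # g^mu = h_bar(e^mu)

metric = g_lower.T @ ETA @ g_lower        # g_{mu nu} = g_mu . g_nu
reciprocity = g_lower.T @ ETA @ g_upper   # g_mu . g^nu

# The metric reproduces the covariant inner product of tangent vectors.
v = rng.normal(size=4)
interval_from_metric = v @ metric @ v
interval_from_h = dot(H_inv @ v, H_inv @ v)

print(np.allclose(reciprocity, np.eye(4)))
print(np.isclose(interval_from_metric, interval_from_h))
```

The first check confirms that $g_\mu$ and $g^\nu$ are reciprocal, which is why the inner product between a tangent and a cotangent vector is gauge-invariant.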

Now that we have recovered the metric, the obvious question is what has
happened to the original flat space? It has not gone away, as all fields take their
values over this space. In fact, there are now three distinct spaces of objects we
can discuss. We refer to these as the tangent, cotangent and covariant spaces.
Tangent vectors are of the form $e_\mu$. Inner products between these are not
gauge-invariant, and hence not physically meaningful. Similarly, cotangent vectors are
of the form of $e^\mu$, and the inner product of cotangent vectors is also an unphysical
quantity. The inner product between tangent and cotangent vectors does produce
a gauge-invariant quantity, so can correspond to a physical observable. Tangent
and cotangent vectors can be interchanged via the metric, which maps one space
into the other. In frame-free form, we can write

$$ a^{*} = \bar{\mathsf{h}}^{-1} \underline{\mathsf{h}}^{-1}(a) = \mathsf{g}(a). \tag{13.145} $$

The tangent and cotangent spaces, and the metric map between them, are the
traditional elements of general relativity. Our third space, of covariant objects,
is unique to the gauge theory formulation. This space consists of objects whose
transformation law under displacements is

$$ \phi'(x) = \phi(x'). \tag{13.146} $$

This defines what it means to transform covariantly under displacements. These
include velocity vectors of the form $\underline{\mathsf{h}}^{-1}(\partial_\lambda x)$, gradients of the form $\bar{\mathsf{h}}(\nabla)\phi$, and
spinor fields. Inner products between covariant vectors produce covariant scalars,
which can be physically observable.

Figure 13.1 Gauge fields for gravitation. There are three vector spaces
involved, consisting of tangent vectors $\partial_\lambda x$, cotangent vectors $\nabla\phi$ and
covariant fields $\mathcal{A}$. The $\mathsf{h}$ field maps between these. The metric tensor maps
between tangent and cotangent vectors, so is given by $\mathsf{g} = \bar{\mathsf{h}}^{-1} \underline{\mathsf{h}}^{-1}$.
Gauge-invariant quantities are formed from the scalar product of a tangent and
cotangent vector, or from a pair of covariant vectors.

The various fields and spaces involved are depicted in figure 13.1. The advan- 
tage of the gauge theory viewpoint, coupled with the application of spacetime 
algebra, is that we can now take full advantage of the space of covariant objects 
when analysing the gravitational field equations. This turns out to have many 
advantages, both conceptually and computationally. The possibilities afforded 
by this space have been overlooked in most treatments of gauge theory gravity. 
One immediate question posed by figure 13.1 is whether the insistence on the 
existence of a map from a curved spacetime onto a flat one has any topologi- 
cal consequences. The answer is yes, though the restrictions are not as severe 
as one might expect. Many apparently topological constructions, such as cos- 
mic strings and closed universe models, are easily handled in the gauge theory 
framework. Others, such as wormholes connecting multiple universes, do not 
fit so easily because they require a modification of the initial assumption that 
the background space is topologically flat. Models incorporating these effects 
can be constructed, though their motivation is less clear from the gauge theory
perspective, as aspects of the theory have to be put in by hand initially. 


13.4.2 Rotations 


Now that we have discovered the metric tensor within the gauge approach we 
could immediately write down the familiar equations of general relativity. But we 
seek a theory formulated entirely in terms of covariant vectors, and this requires 
the existence of a second gauge field. As well as invariance under displacements, 
we require that our wave equation be invariant under the transformation 


$$ \psi \mapsto \psi' = R\psi, \tag{13.147} $$

where $R$ is an arbitrary, position-dependent spacetime rotor. We are now back
in the territory of section 13.3.3, with the difference that the rotor multiplies $\psi$
from the left, instead of the right. To convert $\partial_\mu$ into a covariant derivative, we
add a bivector connection $\Omega_\mu$ and define

$$ D_\mu\psi = \partial_\mu\psi + \tfrac{1}{2} \Omega_\mu \psi. \tag{13.148} $$

The connection $\Omega_\mu$ is a position-dependent bivector, subject to the
transformation law

$$ \Omega_\mu \mapsto \Omega'_\mu = R\, \Omega_\mu \tilde{R} - 2\, \partial_\mu R\, \tilde{R}. \tag{13.149} $$

Since $R$ is an arbitrary rotor there is no constraint on the blades that $\Omega_\mu$ can
contain, so $\Omega_\mu$ has 6 × 4 = 24 degrees of freedom.
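
The transformation law for the rotation gauge connection follows by repeating the argument of section 13.3.3 with the rotor acting from the left. A short worked sketch, requiring only that $D'_\mu(R\psi) = R\, D_\mu\psi$ for every $\psi$:

```latex
\begin{align*}
D'_\mu(R\psi) &= \partial_\mu R\,\psi + R\,\partial_\mu\psi
                 + \tfrac{1}{2}\Omega'_\mu R\,\psi, \\
R\,D_\mu\psi  &= R\,\partial_\mu\psi + \tfrac{1}{2}R\,\Omega_\mu\psi, \\
\Rightarrow\quad
\partial_\mu R + \tfrac{1}{2}\Omega'_\mu R &= \tfrac{1}{2}R\,\Omega_\mu
\quad\Rightarrow\quad
\Omega'_\mu = R\,\Omega_\mu\tilde{R} - 2\,\partial_\mu R\,\tilde{R} .
\end{align*}
```

Compared with the right-sided case, the rotor and its reverse exchange places and the derivative term becomes $\partial_\mu R\,\tilde{R}$ in place of $\tilde{R}\,\partial_\mu R$.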

With the rotation gauge field included, the fully covariant Dirac action now
reads, with the electromagnetic term included,

$$ S = \int d^4x\, \det(\mathsf{h})^{-1} \bigl\langle \bar{\mathsf{h}}(\gamma^\mu) \bigl( \partial_\mu\psi + \tfrac{1}{2} \Omega_\mu\psi \bigr) I\gamma_3\tilde{\psi} - e\, \bar{\mathsf{h}}(A)\psi\gamma_0\tilde{\psi} - m\psi\tilde{\psi} \bigr\rangle. \tag{13.150} $$

The value of this action should be unchanged under local displacements and
rotations. To establish this we need to complete the set of transformation properties
for the gravitational gauge fields. First, we need to define how $\Omega_\mu$ transforms
under displacements. For this it is easier to use the notation $\Omega(a; x)$ for the
linear argument and position dependence of the connection. Since $\Omega(a)$ picks up a
term in $a \cdot \nabla R\, \tilde{R}$ under local rotations, we see that the appropriate transformation
law under displacements is

$$ \Omega'(a; x) = \Omega\bigl(\underline{\mathsf{f}}(a); x'\bigr). \tag{13.151} $$

The connection in the action of equation (13.150) is contracted to form the object

$$ \bar{\mathsf{h}}(\gamma^\mu)\, \Omega_\mu = \bar{\mathsf{h}}(\partial_a)\, \Omega(a). \tag{13.152} $$

So under a displacement this transforms to

$$ \bar{\mathsf{h}}'(\partial_a)\, \Omega'(a) = \bar{\mathsf{h}}\bigl(\bar{\mathsf{f}}^{-1}(\partial_a); x'\bigr)\, \Omega\bigl(\underline{\mathsf{f}}(a); x'\bigr) = \bar{\mathsf{h}}(\partial_a; x')\, \Omega(a; x'), \tag{13.153} $$

which is precisely the behaviour we require.

Similarly, we can establish the behaviour of the $\bar{\mathsf{h}}$ field under rotations from the
kinetic term in the covariant Dirac action. Under a local rotation this transforms
to

$$ \bigl\langle \bar{\mathsf{h}}'(\gamma^\mu) \bigl( \partial_\mu\psi' + \tfrac{1}{2} \Omega'_\mu\psi' \bigr) I\gamma_3\tilde{\psi}' \bigr\rangle
   = \bigl\langle \tilde{R}\, \bar{\mathsf{h}}'(\gamma^\mu) R \bigl( \partial_\mu\psi + \tfrac{1}{2} \Omega_\mu\psi \bigr) I\gamma_3\tilde{\psi} \bigr\rangle. \tag{13.154} $$

So under rotations we must have

$$ \bar{\mathsf{h}}(a) \mapsto \bar{\mathsf{h}}'(a) = R\, \bar{\mathsf{h}}(a)\, \tilde{R}. \tag{13.155} $$

The same transformation law is obeyed by vectors of the form $\underline{\mathsf{h}}^{-1}(a)$, where $a$ is a
tangent vector. This guarantees that inner products between tangent and
cotangent vectors are gauge-invariant, as required. The action of equation (13.150)
now contains all of the local symmetries we require. The coupling of the
electromagnetic vector potential $A$ follows from the fact that $A$ generalises the gradient
of a scalar, so is a cotangent vector. This is acted on by $\bar{\mathsf{h}}$ to establish a covariant
vector.


13.4.3 The Dirac equation in a gravitational background


We have so far established invariance at the level of the Dirac action, which led
us to the action of equation (13.150). We now vary this action with respect to
$\psi$, treating all other fields as external, to obtain the full,
minimally-coupled Dirac equation. After reversing, variation with respect to
$\psi$ produces the equation


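$$\bar{h}(\nabla)\psi I\gamma_3 + \tfrac{1}{2}\bar{h}(\gamma^\mu)\Omega_\mu\psi I\gamma_3 + \tfrac{1}{2}\Omega_\mu\bar{h}(\gamma^\mu)\psi I\gamma_3 - 2e\,\bar{h}(A)\psi\gamma_0 - 2m\psi = -\frac{\partial}{\partial x^\mu}\big(\det(h)^{-1}\bar{h}(\gamma^\mu)\psi I\gamma_3\big)\det(h). \tag{13.156}$$
This simplifies to
$$\bar{h}(\gamma^\mu)\big(\partial_\mu\psi + \tfrac{1}{2}\Omega_\mu\psi\big)I\gamma_3 - e\,\bar{h}(A)\psi\gamma_0 = m\psi - \tfrac{1}{2}t\psi I\gamma_3, \tag{13.157}$$
where the vector t is defined by
$$t = \det(h)\,\partial_\mu\big(\det(h)^{-1}\bar{h}(\gamma^\mu)\big) + \Omega_\mu\cdot\bar{h}(\gamma^\mu). \tag{13.158}$$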


Here we encounter an initial surprise. The minimally-coupled Dirac action only 
produces the expected Dirac equation if the vector t is zero. We will establish 
the circumstances when this holds once we have discovered the full gravitational 
field equations. With t assumed to equal zero, we obtain the expected equation, 
which we write as 


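$$D\psi I\gamma_3 - e\mathcal{A}\psi\gamma_0 = m\psi. \tag{13.159}$$
Here we introduce the notation
$$D\psi = \bar{h}(\gamma^\mu)D_\mu\psi = \bar{h}(\gamma^\mu)\big(\partial_\mu\psi + \tfrac{1}{2}\Omega_\mu\psi\big) \tag{13.160}$$
and
$$\bar{h}(A) = \mathcal{A}. \tag{13.161}$$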


In this latter definition we begin to introduce the useful notation of writing fully 
covariant multivectors in calligraphic font. 


13.4.4 Covariant derivatives for observables 


Having established the form of the gravitational covariant derivative for a spinor, 
it is a simple matter to establish the form of the derivatives of the observables 
formed from a spinor. In general, these observables have the form 


$$M = \psi\Gamma\tilde{\psi}, \tag{13.162}$$

where $\Gamma$ is a constant multivector formed from combinations of $\gamma_0$,
$\gamma_3$ and $I\sigma_3$. The observable M inherits its transformation
properties from the spinor $\psi$, so under displacements M transforms as

$$M(x) \mapsto M'(x) = M(x') \tag{13.163}$$
and under rotations M transforms as
$$M \mapsto M' = RM\tilde{R}. \tag{13.164}$$


Multivectors with these transformation properties are said to be (fully) covariant. 
Scalars formed from inner products of these quantities account for the physical 
observables in the theory. 

If we now form the partial derivative of M we obtain 


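$$\partial_\mu M = (\partial_\mu\psi)\Gamma\tilde{\psi} + \psi\Gamma(\partial_\mu\psi)^\sim. \tag{13.165}$$

There is no need to restrict to orthonormal coordinates, so we can take
$\partial_\mu$ as the derivative with respect to an arbitrary coordinate system,
with coordinate frame $\{e_\mu\}$. We immediately see how to construct a
covariant derivative for M. We simply replace spinor directional derivatives with
their covariant versions and form
$$(D_\mu\psi)\Gamma\tilde{\psi} + \psi\Gamma(D_\mu\psi)^\sim = \partial_\mu(\psi\Gamma\tilde{\psi}) + \tfrac{1}{2}\Omega_\mu\psi\Gamma\tilde{\psi} - \tfrac{1}{2}\psi\Gamma\tilde{\psi}\,\Omega_\mu = \partial_\mu(\psi\Gamma\tilde{\psi}) + \Omega_\mu\times(\psi\Gamma\tilde{\psi}), \tag{13.166}$$
where
$$\Omega_\mu = \Omega(e_\mu). \tag{13.167}$$
We therefore define the covariant derivative $\mathcal{D}_\mu$ by
$$\mathcal{D}_\mu M = \partial_\mu M + \Omega_\mu\times M. \tag{13.168}$$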


This is the form appropriate for acting on covariant multivectors, including
observables formed from spinors. The commutator with the bivector $\Omega_\mu$
has two important properties. The first is that it is grade-preserving, so the
full $\mathcal{D}_\mu$ operator preserves grade. The second is that

$$\Omega_\mu\times(AB) = (\Omega_\mu\times A)B + A(\Omega_\mu\times B), \tag{13.169}$$
which holds for any multivectors A and B. This ensures that $\mathcal{D}_\mu$ is
a derivation. That is, it satisfies Leibniz's rule

$$\mathcal{D}_\mu(AB) = (\mathcal{D}_\mu A)B + A(\mathcal{D}_\mu B). \tag{13.170}$$
These properties of preserving grade and satisfying Leibniz's rule are necessary
for $\mathcal{D}_\mu$ to be a suitable generalisation of a directional derivative.
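The derivation property (13.169) relies only on associativity of the product, so it can be spot-checked numerically in any associative algebra. The sketch below uses random 4 x 4 real matrices as stand-ins for multivectors; this is an assumption made purely for illustration (the matrices are not a faithful representation of the spacetime algebra), and the helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator_product(a, b):
    """The commutator product A x B = (AB - BA)/2."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[0.5 * (x - y) for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def random_matrix(n, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
omega, A, B = (random_matrix(4, rng) for _ in range(3))

# Left- and right-hand sides of the Leibniz-type identity (13.169).
lhs = commutator_product(omega, matmul(A, B))
rhs = add(matmul(commutator_product(omega, A), B),
          matmul(A, commutator_product(omega, B)))

max_err = max(abs(x - y) for r1, r2 in zip(lhs, rhs) for x, y in zip(r1, r2))
print("largest deviation:", max_err)
```

The deviation is at the level of floating-point rounding, as expected. Grade preservation, the other property quoted above, is specific to commutators with bivectors and is not tested by this matrix stand-in.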
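We can assemble a full, covariant version of the vector derivative by writing
$$\mathcal{D} = \bar{h}(e^\mu)\mathcal{D}_\mu = g^\mu\mathcal{D}_\mu, \tag{13.171}$$
where $g^\mu = \bar{h}(e^\mu)$. This acts on covariant multivectors to raise and
lower the grade by one. We can also write
$$\mathcal{D}M = \mathcal{D}\cdot M + \mathcal{D}\wedge M, \tag{13.172}$$
where M is a homogeneous-grade multivector, and
$$\mathcal{D}\cdot M = g^\mu\cdot(\mathcal{D}_\mu M), \qquad \mathcal{D}\wedge M = g^\mu\wedge(\mathcal{D}_\mu M). \tag{13.173}$$

It is also sometimes convenient to write the directional covariant derivative as
$a\cdot\mathcal{D}$, where

$$a\cdot\mathcal{D}\,M = a\cdot g^\mu\,\mathcal{D}_\mu M. \tag{13.174}$$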


We are now beginning to assemble a very powerful, compact notation for the 
main operators in gauge theory gravitation. 


13.5 The gravitational field equations 


The price we pay for ensuring that the Dirac action is invariant under local
rotations is the introduction of two gauge fields: the vector-valued function
$\bar{h}(a)$ and the bivector-valued $\Omega(a)$. These in total have 40 degrees
of freedom. Our next task is to construct suitable equations for these gauge
fields. As with the Dirac equation, our ultimate goal is to formulate the
equations in terms of covariant objects, where the physical content of the theory
is clearest. The alternative approach is to work entirely in terms of the metric
$g_{\mu\nu}$. This is invariant under rotations, so all reference to the rotation
gauge is removed. The end result is a set of second-order equations that are
notoriously difficult to solve. The gauge theory approach, with its focus on
gauge-covariant objects, provides a number of new solution strategies, both for
analytical and numerical work.

Our method for constructing covariant field equations is to find a covariant 
Lagrangian and vary this. The resulting equations are then guaranteed to be 
covariant. Our first task, then, is to find covariant forms of the field strengths 
for the gravitational gauge fields. From these we can construct covariant scalar 
quantities, which can act as a Lagrangian density. 


13.5.1 The rotation-gauge field strength 


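The field strength for the $\Omega(a)$ connection is found in the standard way by
considering commutators of covariant derivatives. We define

$$[D_\mu, D_\nu]\psi = \tfrac{1}{2}R_{\mu\nu}\psi, \tag{13.175}$$
so that
$$R_{\mu\nu} = \partial_\mu\Omega_\nu - \partial_\nu\Omega_\mu + \Omega_\mu\times\Omega_\nu. \tag{13.176}$$
A frame-free notation is introduced by first writing
$$R_{\mu\nu} = R(e_\mu\wedge e_\nu), \tag{13.177}$$

where the $\{e_\mu\}$ vectors are the coordinate frame defined by the $x^\mu$. We
can therefore write

$$R(a\wedge b) = a\cdot\nabla\Omega(b) - b\cdot\nabla\Omega(a) + \Omega(a)\times\Omega(b). \tag{13.178}$$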


Whenever we adopt this notation we assume that the vector arguments a and b are
constant. Since the right-hand side is antisymmetric on a and b, the field
strength depends only on the bivector $a\wedge b$. This linear action on bivector
blades is extended to general bivectors by defining

$$R(a\wedge b + c\wedge d) = R(a\wedge b) + R(c\wedge d). \tag{13.179}$$
This means that we can write the field strength as
$$R(B) = R(B; x), \tag{13.180}$$

which is a position-dependent, linear function of the bivector B. The field
strength is a general bivector, as there are no restrictions on the form of
$\Omega(a)$. This means that $R(a\wedge b)$ has 36 degrees of freedom, as opposed
to the rather simpler six of electromagnetism.
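The degree-of-freedom counts quoted so far follow from multiplying the dimension of the argument space by the dimension of the value space of each linear function. The short sketch below reproduces that arithmetic for a four-dimensional spacetime; the variable names are illustrative only.

```python
from math import comb

DIM = 4                       # spacetime dimension
n_vector = comb(DIM, 1)       # independent components of a vector: 4
n_bivector = comb(DIM, 2)     # independent components of a bivector: 6

# h(a): vector-valued linear function of a vector.
dof_h = n_vector * n_vector
# Omega(a): bivector-valued linear function of a vector.
dof_omega = n_vector * n_bivector
# Both gravitational gauge fields together.
dof_gauge_fields = dof_h + dof_omega
# R(B): bivector-valued linear function of a bivector.
dof_field_strength = n_bivector * n_bivector
# Electromagnetic field strength: a single bivector.
dof_faraday = n_bivector

print(dof_h, dof_omega, dof_gauge_fields, dof_field_strength, dof_faraday)
```

This prints 16, 24, 40, 36 and 6, matching the figures in the text.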

Unlike the electromagnetic case of equation (13.109), the commutator term
$\Omega(a)\times\Omega(b)$ has not cancelled out. This has an important
consequence for the field equations: they are no longer linear. If we add
together two configurations of $\Omega(a)$, the field strength of the resultant
$\Omega(a)$ is not the same as that from the superposition of the original field
strengths. This makes the gravitational field equations much more difficult to
solve than those of electromagnetism.
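The failure of superposition can be made explicit by expanding the definition (13.178) for a sum of two connections. The labels $\Omega_1$ and $\Omega_2$, and the notation $R[\Omega]$ for the field strength built from a given connection, are introduced here only for illustration.

```latex
R[\Omega_1 + \Omega_2](a\wedge b)
  = R[\Omega_1](a\wedge b) + R[\Omega_2](a\wedge b)
  + \Omega_1(a)\times\Omega_2(b) + \Omega_2(a)\times\Omega_1(b)
```

The two derivative terms in (13.178) are linear in the connection, so the only obstruction to superposition is the pair of cross terms on the second line. These vanish identically only when the connections commute, as they do in electromagnetism.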

The definition of R(B) in terms of commutators makes it easy to establish its 
transformation properties under rotation gauge transformations. We see that 


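$$[D'_\mu, D'_\nu]\psi' = \tfrac{1}{2}R'(e_\mu\wedge e_\nu)R\psi = R[D_\mu, D_\nu]\psi = \tfrac{1}{2}R\,R(e_\mu\wedge e_\nu)\psi, \tag{13.181}$$

from which we can read off that

$$R'(B) = R\,R(B)\tilde{R}. \tag{13.182}$$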


Unlike electromagnetism, the field strength now transforms under gauge trans- 
formations, albeit in a straightforward way. 

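Under displacements, $\Omega(a)$ transforms as defined in equation (13.153). It
follows that the field strength transforms to

$$\begin{aligned}
R'(e_\mu\wedge e_\nu) &= \partial_\mu\Omega'(e_\nu) - \partial_\nu\Omega'(e_\mu) + \Omega'(e_\mu)\times\Omega'(e_\nu) \\
&= \underline{f}(e_\mu)\cdot\nabla_{x'}\Omega(\underline{f}(e_\nu); x') - \underline{f}(e_\nu)\cdot\nabla_{x'}\Omega(\underline{f}(e_\mu); x') + \Omega'(e_\mu)\times\Omega'(e_\nu) \\
&\quad + \Omega(\partial_\mu\underline{f}(e_\nu) - \partial_\nu\underline{f}(e_\mu); x') \\
&= R(\underline{f}(e_\mu\wedge e_\nu); x') + \Omega(\partial_\mu\underline{f}(e_\nu) - \partial_\nu\underline{f}(e_\mu); x').
\end{aligned} \tag{13.183}$$
But we know that
$$\partial_\mu\underline{f}(e_\nu) - \partial_\nu\underline{f}(e_\mu) = \partial_\mu\partial_\nu f(x) - \partial_\nu\partial_\mu f(x) = 0, \tag{13.184}$$
so the field strength has the simple displacement transformation law
$$R(B) \mapsto R'(B) = R(\underline{f}(B); x'). \tag{13.185}$$

We see that $R'(B)$ picks up a term in $\underline{f}(B)$ under displacements, so
is not fully covariant. To form a covariant tensor we insert a term in
$\underline{h}(a)$ into $R(B)$ and define the covariant field strength

$$\mathcal{R}(B) = R(\underline{h}(B)). \tag{13.186}$$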


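The factor of $\underline{h}(B)$ in this definition alters the transformation
properties under rotations. Since $\bar{h}$ transforms according to equation
(13.155), the adjoint transforms as

$$\underline{h}(a) \mapsto \underline{h}'(a) = \partial_b\langle aR\,\bar{h}(b)\tilde{R}\rangle = \underline{h}(\tilde{R}aR). \tag{13.187}$$
The transformation properties of $\mathcal{R}(B)$ are therefore summarised by:

$$\begin{aligned}
\text{displacements:}\quad & \mathcal{R}'(B; x) = \mathcal{R}(B; x'), \\
\text{rotations:}\quad & \mathcal{R}'(B) = R\,\mathcal{R}(\tilde{R}BR)\tilde{R}.
\end{aligned} \tag{13.188}$$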


These are precisely the properties we require, and they define a covariant tensor.
The rotation law may look complicated, but it is quite natural. For example,
suppose that $\mathcal{R}(B)$ simply amounts to the instruction 'dilate all
fields by the factor $\alpha$'. This is a physical statement, so ought to be true
in all gauges. The original statement corresponds to

$$\mathcal{R}(B) = \alpha B. \tag{13.189}$$
The transformed field is then
$$\mathcal{R}'(B) = R\,\mathcal{R}(\tilde{R}BR)\tilde{R} = R(\alpha\tilde{R}BR)\tilde{R} = \alpha B, \tag{13.190}$$

so does contain the same physical information. The function $\mathcal{R}(B)$
plays the same role in the gauge theory approach as the curvature tensor in
general relativity, so we refer to $\mathcal{R}(B)$ as the Riemann tensor. We
continue to employ the notational device of writing covariant tensors in
calligraphic symbols to help keep track of which objects are gauge-invariant.


13.5.2 The displacement-gauge field strength 


The displacement gauge field couples to the vector derivative to form the object
$\bar{h}(\nabla)$. This coupling is different to that of the connection for the rotation gauge
field, and we cannot use the commutator of covariant derivatives to obtain the 
field strength. Indeed, the precise definition and meaning of the field strength 
for the displacement gauge are unclear. Here we motivate a definition that has 
the desired properties and is physically plausible. 

The main property we require of a field strength is that it should vanish if the 
field is obtained by a pure gauge transformation. If we start with the identity 
and apply a displacement, the induced h field is given by 
$$\bar{h}(a) = \bar{f}^{-1}(a). \tag{13.191}$$
One of the properties satisfied by a pure displacement is that

$$\nabla\wedge\bar{f}(a) = 0. \tag{13.192}$$
So $\bar{h}$ will define a pure gauge transformation if it satisfies
$$\nabla\wedge\bar{h}^{-1}(a) = 0, \tag{13.193}$$


where we temporarily ignore the rotation gauge. The left-hand side is our can- 
didate object for the field strength. The task now is to make it covariant. 

We know that the vector derivative $\nabla$ picks up a factor of $\bar{h}$ to
convert it to covariant form. Since $\bar{h}^{-1}$ transforms in the same way as
$\nabla$, we can define a displacement-gauge covariant object $H(a)$ as

$$H(a) = -\bar{h}\big(\nabla\wedge\bar{h}^{-1}(a)\big) = \bar{h}(\dot{\nabla})\wedge\dot{\bar{h}}\,\bar{h}^{-1}(a). \tag{13.194}$$
This is a bivector-valued function of its vector argument. The final step is to
convert the derivative to one that is covariant under rotations. This is
straightforward since $\bar{h}$ transforms as a vector under rotations. We
therefore define

$$\mathcal{H}(a) = \bar{h}(\partial_b)\wedge\big(b\cdot\dot{\nabla}\,\dot{\bar{h}}\,\bar{h}^{-1}(a) + \Omega(b)\cdot a\big), \tag{13.195}$$
or, in terms of a coordinate frame,
$$\mathcal{H}(g^\mu) = g^\alpha\wedge(\mathcal{D}_\alpha g^\mu) = \mathcal{D}\wedge g^\mu, \tag{13.196}$$

where we have applied the result that $\nabla\wedge e^\mu = 0$.
The tensor $\mathcal{H}(a)$ is covariant under displacements and rotations, so
transforms as
$$\begin{aligned}
\text{displacements:}\quad & \mathcal{H}'(a; x) = \mathcal{H}(a; x'), \\
\text{rotations:}\quad & \mathcal{H}'(a) = R\,\mathcal{H}(\tilde{R}aR)\tilde{R}.
\end{aligned} \tag{13.197}$$


As we will soon see, the object we have defined is in fact the torsion tensor, a
bivector-valued function of a vector with 6 x 4 = 24 degrees of freedom. This
is the appropriate number for the field strength of the displacement gauge, as a
displacement is specified by four degrees of freedom. In the simplest formulation
of the field equations, the torsion is equated with the spin of the matter. It
is therefore a pure contact term, and usually extremely small. One can justify
this on dimensional grounds. The two field strengths we have defined,
$\mathcal{H}(a)$ and $\mathcal{R}(B)$, differ in dimensions by a factor of
length. This is because $\Omega(a)$ has dimensions of (length)$^{-1}$, whereas
$\bar{h}(a)$ is dimensionless. The only fundamental length scale that could
relate these is the Planck length, $l_p$, which is tiny. The natural scale for
$\mathcal{H}(a)$ is therefore $l_p$ times $\mathcal{R}(B)$, making it negligible
compared to the Riemann tensor.
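To give a sense of how small this scale is, the sketch below evaluates the standard expression $l_p = \sqrt{\hbar G/c^3}$ for the Planck length. The numerical constants are rounded SI values supplied here for illustration, and the one-metre curvature scale is a purely hypothetical example.

```python
import math

# Rounded SI values (assumed for illustration, not taken from the text).
HBAR = 1.054571817e-34      # reduced Planck constant, J s
G_NEWTON = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 2.99792458e8      # speed of light, m s^-1

planck_length = math.sqrt(HBAR * G_NEWTON / C_LIGHT**3)

# Hypothetical example: suppression factor l_p / L for curvature varying
# on a length scale L of one metre.
curvature_scale = 1.0
suppression = planck_length / curvature_scale

print(f"Planck length: {planck_length:.3e} m")
print(f"suppression factor for L = 1 m: {suppression:.3e}")
```

The result is of order $10^{-35}$ m, which is why the dimensional argument makes torsion effects negligible next to the Riemann tensor at macroscopic scales.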


13.5.3 The gravitational action 


We have now defined two covariant tensors from the gravitational gauge fields 
— the Riemann and torsion tensors. We next require a scalar term to act as the 
Lagrangian density for gravitation. There are a number of quadratic scalars we 
can derive from the gauge fields, but only one scalar is linear in the field strength. 
This is important, as one can again argue on dimensional grounds that higher 
order terms should be reduced by factors of the Planck length. 

We first define the contractions of the Riemann tensor. The first is the Ricci 
tensor: 


$$\mathcal{R}(b) = \partial_a\cdot\mathcal{R}(a\wedge b). \tag{13.198}$$

By construction, this is a tensor. The Ricci tensor can be contracted further to
define the Ricci scalar

$$\mathcal{R} = \partial_a\cdot\mathcal{R}(a). \tag{13.199}$$

We use the same symbol to denote the Riemann tensor, Ricci tensor and Ricci
scalar, and distinguish between these by their argument. The Ricci scalar is a
covariant scalar field, so is invariant under rotations and transforms covariantly
under displacements. The Ricci scalar is the first scalar observable we have
constructed from the gravitational fields, and is the simplest candidate for the
Lagrangian density. We therefore suppose that the overall action integral is of
the form

$$S = \int |d^4x|\,\det(h)^{-1}\big(\tfrac{1}{2}\mathcal{R} + \Lambda - \kappa\mathcal{L}_m\big), \tag{13.200}$$
where $\mathcal{L}_m$ describes the matter content and $\kappa = 8\pi G$. We have
also included the cosmological constant $\Lambda$, though for most applications
we set this to zero. The independent dynamical variables are $\bar{h}(a)$ and
$\Omega(a)$, and we assume that $\mathcal{L}_m$ contains no second-order
derivatives, so that $\bar{h}(a)$ and $\Omega(a)$ appear undifferentiated in the
matter Lagrangian.


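The $\bar{h}$ field is undifferentiated in the entire action, as we have not
included any terms in $\mathcal{H}(a)$. The Euler-Lagrange equation for $\bar{h}$
is simply

$$\partial_{\bar{h}(a)}\big(\det(h)^{-1}(\mathcal{R}/2 + \Lambda - \kappa\mathcal{L}_m)\big) = 0. \tag{13.201}$$
Employing the results of section 11.1.2 we find that
$$\partial_{\bar{h}(a)}\det(h)^{-1} = -\det(h)^{-1}\underline{h}^{-1}(a) \tag{13.202}$$
and

$$\partial_{\bar{h}(a)}\mathcal{R} = \partial_{\bar{h}(a)}\big\langle\bar{h}(\partial_c\wedge\partial_b)R(b\wedge c)\big\rangle = 2\bar{h}(\partial_b)\cdot R(b\wedge a). \tag{13.203}$$
It follows that
$$\partial_{\bar{h}(a)}\big(\mathcal{R}\det(h)^{-1}\big) = 2\mathcal{G}(\underline{h}^{-1}(a))\det(h)^{-1}, \tag{13.204}$$
where $\mathcal{G}$ is the Einstein tensor,
$$\mathcal{G}(a) = \mathcal{R}(a) - \tfrac{1}{2}a\mathcal{R}. \tag{13.205}$$

We now define the functional matter energy-momentum tensor $\mathcal{T}(a)$ by
$$\det(h)\,\partial_{\bar{h}(a)}\big(\mathcal{L}_m\det(h)^{-1}\big) = \mathcal{T}(\underline{h}^{-1}(a)). \tag{13.206}$$
We therefore arrive at the first of our field equations,
$$\mathcal{G}(a) - \Lambda a = \kappa\mathcal{T}(a). \tag{13.207}$$

This is the gauge theory statement of Einstein's equation. The source term
in the Einstein equations is the functional energy-momentum tensor, not the
canonical one. The form of this is discussed once we have found the remaining
field equations for the rotation gauge field.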
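As a short worked step, equation (13.207) can be contracted to express the Ricci scalar in terms of the trace of the matter tensor. The symbol $\mathcal{T} = \partial_a\cdot\mathcal{T}(a)$ for that trace is introduced here for convenience and is an assumption of this sketch; the only other input is $\partial_a\cdot a = 4$ in four dimensions.

```latex
\partial_a\cdot\mathcal{G}(a) = \mathcal{R} - \tfrac{1}{2}(\partial_a\cdot a)\,\mathcal{R}
                              = \mathcal{R} - 2\mathcal{R} = -\mathcal{R}
\quad\Longrightarrow\quad
-\mathcal{R} - 4\Lambda = \kappa\,\mathcal{T},
\qquad
\mathcal{R}(a) = \kappa\big(\mathcal{T}(a) - \tfrac{1}{2}a\,\mathcal{T}\big) - \Lambda a .
```

The last form follows by substituting $\mathcal{R} = -\kappa\mathcal{T} - 4\Lambda$ back into the definition (13.205). It shows directly that, with $\Lambda = 0$, the Ricci tensor vanishes wherever the matter tensor does.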
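The Euler-Lagrange field equation from $\Omega(a)$ is, after multiplying through
by $\det(h)$,

$$\frac{\partial\mathcal{R}}{\partial\Omega(a)} - \det(h)\,\frac{\partial}{\partial x^\mu}\left(\frac{\partial\mathcal{R}}{\partial(\partial_\mu\Omega(a))}\det(h)^{-1}\right) = 2\kappa\,\frac{\partial\mathcal{L}_m}{\partial\Omega(a)}, \tag{13.208}$$

where we have employed the assumption that $\Omega(a)$ does not contain any
coupling to matter through its derivatives, and have temporarily reverted to an
orthonormal coordinate system. The right-hand side defines the matter spin
tensor

$$S(a) = \frac{\partial\mathcal{L}_m}{\partial\Omega(a)}. \tag{13.209}$$
This has the covariant form
$$\mathcal{S}(a) = S(\bar{h}^{-1}(a)), \tag{13.210}$$
which is a covariant tensor. For the left-hand side we use the results
$$\partial_{\Omega(a)}\big\langle\bar{h}(\partial_d\wedge\partial_c)\,\Omega(c)\times\Omega(d)\big\rangle = 2\,\Omega(d)\times\bar{h}(\partial_d\wedge a)$$
and
$$\partial_{\Omega(a)_{,\mu}}\big\langle\bar{h}(\partial_d\wedge\partial_c)\big(c\cdot\nabla\Omega(d) - d\cdot\nabla\Omega(c)\big)\big\rangle = 2\bar{h}(a\wedge\gamma^\mu). \tag{13.211}$$
Combining these results, equation (13.208) becomes
$$\bar{h}(\dot{\nabla})\wedge\dot{\bar{h}}(a) + \det(h)\,\partial_\mu\big(\bar{h}(\gamma^\mu)\det(h)^{-1}\big)\wedge\bar{h}(a) + \Omega(b)\times\bar{h}(\partial_b\wedge a) = \kappa S(a). \tag{13.212}$$

Recalling the definitions of $\mathcal{H}(a)$ and t from equations (13.195) and
(13.158) respectively, the second field equation has the covariant form

$$\mathcal{H}(a) + t\wedge a = \kappa\mathcal{S}(a). \tag{13.213}$$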


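So, as stated, $\mathcal{H}$ is governed by the matter spin density.
The second field equation (13.213) simplifies further once we form the
contraction of the torsion tensor $\mathcal{H}(a)$. This is

$$\partial_a\cdot\mathcal{H}(a) = \mathcal{D}_\mu\bar{h}(\gamma^\mu) - \bar{h}(\gamma^\mu)\,\underline{h}^{-1}(\gamma_\nu)\cdot\big(\partial_\mu\bar{h}(\gamma^\nu)\big). \tag{13.214}$$
But we can now use

$$\underline{h}^{-1}(\gamma_\nu)\cdot\big(\partial_\mu\bar{h}(\gamma^\nu)\big) = \big\langle(\partial_\mu\bar{h}(\gamma^0))\wedge\bar{h}(\gamma^1\wedge\gamma^2\wedge\gamma^3)\,I^{-1}\det(h)^{-1}\big\rangle + \cdots = \det(h)^{-1}\partial_\mu\det(h), \tag{13.215}$$

to write

$$\partial_a\cdot\mathcal{H}(a) = \det(h)\,\mathcal{D}_\mu\big(\det(h)^{-1}\bar{h}(\gamma^\mu)\big) = t. \tag{13.216}$$
So the vector t which appeared in the Dirac equation is the contraction of the
torsion tensor. On contracting equation (13.213) we find that

$$-2t = \kappa\,\partial_a\cdot\mathcal{S}(a), \tag{13.217}$$

which directly relates t to the matter spin density. The second field equation
can now be written as

$$\mathcal{H}(a) = \kappa\mathcal{S}(a) + \tfrac{1}{2}\kappa\big(\partial_b\cdot\mathcal{S}(b)\big)\wedge a. \tag{13.218}$$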


This equation directly relates the torsion to the matter spin density. 
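The contraction leading to (13.217) and (13.218) can be spelled out in two lines. The sketch below uses only the identity $a\cdot(b\wedge c) = (a\cdot b)c - (a\cdot c)b$, the result $\partial_a\cdot a = 4$, and the statement (13.216) that the contraction of the torsion tensor is t.

```latex
\partial_a\cdot(t\wedge a) = t - (\partial_a\cdot a)\,t = t - 4t = -3t,
\qquad
\partial_a\cdot\big(\mathcal{H}(a) + t\wedge a\big) = t - 3t = -2t = \kappa\,\partial_a\cdot\mathcal{S}(a).
```

Substituting $t = -\tfrac{1}{2}\kappa\,\partial_b\cdot\mathcal{S}(b)$ into $\mathcal{H}(a) = \kappa\mathcal{S}(a) - t\wedge a$ then gives the form of the second field equation quoted in (13.218).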


13.5.4 The matter content 


To illustrate the structure of the source terms we return to the covariant Maxwell
and Dirac Lagrangian densities. First consider free-field electromagnetism.
Under displacements, the vector potential A transforms as a cotangent vector
(1-form):

$$A(x) \mapsto A'(x) = \bar{f}(A(x')), \tag{13.219}$$
and the field strength F transforms as a 2-form:
$$F \mapsto F'(x) = \nabla\wedge A'(x) = \bar{f}(F(x')). \tag{13.220}$$
The covariant field strength is therefore defined by
$$\mathcal{F} = \bar{h}(F) = \bar{h}(\nabla\wedge A), \tag{13.221}$$
and the covariant Lagrangian density for the electromagnetic field is
$$\mathcal{L}_{em} = \tfrac{1}{2}\mathcal{F}\cdot\mathcal{F}. \tag{13.222}$$
The functional energy-momentum tensor is defined by
$$\mathcal{T}_{em}(\underline{h}^{-1}(a)) = \det(h)\,\partial_{\bar{h}(a)}\big(\tfrac{1}{2}\mathcal{F}\cdot\mathcal{F}\det(h)^{-1}\big) = \bar{h}(a\cdot F)\cdot\mathcal{F} - \tfrac{1}{2}\underline{h}^{-1}(a)\,\mathcal{F}\cdot\mathcal{F}. \tag{13.223}$$
So we obtain
$$\mathcal{T}_{em}(a) = (a\cdot\mathcal{F})\cdot\mathcal{F} - \tfrac{1}{2}a\,\mathcal{F}\cdot\mathcal{F} = -\tfrac{1}{2}\mathcal{F}a\mathcal{F}. \tag{13.224}$$

This is precisely the form we would expect for the covariant generalisation of
the electromagnetic energy-momentum tensor. Unlike the canonical definition,
there is no issue about the tensor being electromagnetic gauge-invariant, and
the tensor is automatically symmetric. Furthermore, there is no coupling to
$\Omega(a)$, so the electromagnetic spin density is zero. We will discover in
section 13.6 that, if the spin tensor is zero, the functional energy-momentum
tensor must also be symmetric.
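The compact form $-\tfrac{1}{2}\mathcal{F}a\mathcal{F}$ can be checked numerically. The sketch below represents the spacetime algebra by the Dirac matrices (an assumed matrix representation, used here only as a computational device), builds a field $\mathcal{F} = E + IB$ from arbitrary illustrative components, and confirms that the energy density $\gamma_0\cdot\mathcal{T}_{em}(\gamma_0)$ equals $\tfrac{1}{2}(E^2 + B^2)$, that the tensor is symmetric, and that it is trace-free. All function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def scalar_part(m):
    """Scalar part of a multivector in this representation: Re(tr)/4."""
    return sum(m[i][i] for i in range(4)).real / 4.0

# Dirac representation of gamma_0, ..., gamma_3.
g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
g1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
g2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
g3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA = [g0, g1, g2, g3]
PSEUDO = matmul(matmul(g0, g1), matmul(g2, g3))   # pseudoscalar I
REL = [matmul(gk, g0) for gk in (g1, g2, g3)]     # relative vectors sigma_k

def faraday(e_field, b_field):
    """F = E + I B built from relative-vector components."""
    e_part = lincomb(e_field, REL)
    b_part = matmul(PSEUDO, lincomb(b_field, REL))
    return lincomb((1, 1), (e_part, b_part))

def stress_energy(a, F):
    """T(a) = -F a F / 2, as in equation (13.224)."""
    return lincomb((-0.5,), (matmul(matmul(F, a), F),))

def inner(a, b):
    """Inner product of two vectors: the scalar part of their product."""
    return scalar_part(matmul(a, b))

e_field = (0.3, -1.2, 0.5)    # illustrative values only
b_field = (0.7, 0.1, -0.4)
F = faraday(e_field, b_field)

energy = inner(g0, stress_energy(g0, F))
expected_energy = 0.5 * (sum(x * x for x in e_field) + sum(x * x for x in b_field))
trace = sum(eta * inner(g, stress_energy(g, F))
            for eta, g in zip((1, -1, -1, -1), GAMMA))

print(energy, expected_energy, trace)
```

Symmetry follows from cyclic reordering inside the scalar part, and the vanishing trace reflects the identity $\gamma^\mu\mathcal{F}\gamma_\mu = 0$ for a bivector in four dimensions.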

As an example of a field with non-vanishing spin density we next consider the
Dirac theory. With the electromagnetic coupling included, the covariant action is
defined by equation (13.150). The functional energy-momentum tensor is simply

$$\mathcal{T}(a) = \langle a\cdot g^\mu D_\mu\psi I\gamma_3\tilde{\psi}\rangle_1 - e\,a\cdot\mathcal{A}\,\psi\gamma_0\tilde{\psi}. \tag{13.225}$$

This is manifestly a covariant tensor, though it is not necessarily symmetric.
The spin density is

$$S(a) = \tfrac{1}{2}\bar{h}(a)\cdot(\psi I\gamma_3\tilde{\psi}) \tag{13.226}$$
or, covariantly,
$$\mathcal{S}(a) = \tfrac{1}{2}a\cdot(\psi I\gamma_3\tilde{\psi}) = \tfrac{1}{2}a\cdot S, \tag{13.227}$$

where S is the spin trivector. In the limit where gravitational interactions are
turned off, the functional definitions agree with the canonical energy-momentum
and angular momentum tensors.

The form of the Dirac spin has an important consequence. If we form the
contraction we find that

$$2\partial_a\cdot\mathcal{S}(a) = \partial_a\cdot(a\cdot S) = 0, \tag{13.228}$$


so the torsion vector t vanishes. This is reassuring, as it implies that the 
minimally-coupled Dirac action produces the minimally-coupled Dirac equation 
on variation. Equation (13.228) is satisfied by scalar, Dirac and Yang-Mills 
fields. An exception is provided by a vector field that is often introduced to 
ensure local dilation invariance. There are good reasons for introducing such a 
field, though any interactions it might generate are likely to be on the scale of 
quantum gravity and are not discussed here. 

As a further example of a source field for gravitation, we consider the case
of an ideal fluid. This is the simplest form of matter energy-momentum tensor
one can consider, and generates an important class of models. The action for an
ideal fluid was introduced in section 12.4.2, and the only modification required
to convert to a covariant action is multiplication of the energy density by
$\det(h)^{-1}$:

$$S = \int |d^4x|\,\big(-\det(h)^{-1}\varepsilon + J\cdot(\nabla\lambda) - \mu J\cdot\nabla\eta\big). \tag{13.229}$$

The Lagrange multiplier terms are both unaffected by the presence of a
gravitational field. The covariant current density is

$$\mathcal{J} = \det(h)\,\underline{h}^{-1}(J) = \rho v, \tag{13.230}$$

where $v^2 = 1$ (see section 13.5.6). The energy density $\varepsilon$ therefore
depends on the $\bar{h}$ field through its dependence on $\rho$. We find that

$$\partial_{\bar{h}(a)}\rho^2 = 2\rho^2\big(\underline{h}^{-1}(a) - \underline{h}^{-1}(a)\cdot v\,v\big), \tag{13.231}$$
so the functional stress-energy tensor is
$$\mathcal{T}(a) = -\rho\,(a - a\cdot v\,v)\frac{\partial\varepsilon}{\partial\rho} + a\varepsilon. \tag{13.232}$$
Recalling the definition of the pressure from equation (12.158), we are left with
$$\mathcal{T}(a) = -(a - a\cdot v\,v)(\varepsilon + P) + a\varepsilon = (\varepsilon + P)\,a\cdot v\,v - Pa. \tag{13.233}$$
This is precisely the form we expect, with v now a covariant vector satisfying the
constraint $v^2 = 1$. The actual form of v is gauge-dependent, a fact we can
exploit to our advantage in applications by choosing a gauge where v has a
simple form.
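The structure of the ideal-fluid tensor is easy to probe in components. The following sketch evaluates $\mathcal{T}(a) = (\varepsilon + P)\,a\cdot v\,v - Pa$ for a boosted unit velocity, using a Minkowski inner product with signature (+, -, -, -). The numerical values of the energy density, pressure and rapidity are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)   # Minkowski signature (+, -, -, -)

def dot(a, b):
    """Inner product of two vectors given by components in an orthonormal frame."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))

def fluid_stress_energy(a, v, energy_density, pressure):
    """T(a) = (eps + P) (a.v) v - P a for an ideal fluid with velocity v."""
    av = dot(a, v)
    return tuple((energy_density + pressure) * av * vi - pressure * ai
                 for ai, vi in zip(a, v))

eps, P = 2.5, 0.4                # illustrative values only
rapidity = 0.5
v = (math.cosh(rapidity), math.sinh(rapidity), 0.0, 0.0)   # v.v = 1
n = (math.sinh(rapidity), math.cosh(rapidity), 0.0, 0.0)   # n.v = 0, n.n = -1

T_v = fluid_stress_energy(v, v, eps, P)    # expected: eps * v
T_n = fluid_stress_energy(n, v, eps, P)    # expected: -P * n

# Trace: sum over the frame of e^mu . T(e_mu), with e^mu = eta_mu e_mu.
frame = [tuple(1.0 if i == j else 0.0 for j in range(4)) for i in range(4)]
trace = sum(ETA[i] * dot(frame[i], fluid_stress_energy(frame[i], v, eps, P))
            for i in range(4))

print(T_v, T_n, trace)
```

The velocity is an eigenvector with eigenvalue $\varepsilon$, any vector orthogonal to it has eigenvalue $-P$, and the trace is $\varepsilon - 3P$, independent of the boost.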


13.5.5 The torsion-free equations and general relativity 


For many applications the matter spin density is negligible. It is a quantum 
effect, and the macroscopic spin of an object is usually extremely small as all of 
the individual constituents cancel out. In the case where the spin can be ignored 
the second field equation becomes 


$$\mathcal{H}(a) = 0. \tag{13.234}$$
If we replace a by a general cotangent vector A, this equation can be written
$$\mathcal{D}\wedge\bar{h}(A) = \bar{h}(\nabla\wedge A), \tag{13.235}$$


which is extremely useful in practice. This equation says that antisymmetrised 
partial and covariant derivatives produce the same result. We will now establish 
that the spinless gauge field equations are (locally) equivalent to those of general 
relativity. Many of the relevant equations for Riemannian geometry were derived 
in section 6.5.5. 

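To begin, we define the connection by

$$\mathcal{D}_\mu g_\nu = \Gamma^\alpha_{\mu\nu}\,g_\alpha \tag{13.236}$$
so that
$$\Gamma^\lambda_{\mu\nu} = g^\lambda\cdot(\mathcal{D}_\mu g_\nu). \tag{13.237}$$

It follows that the directional covariant derivative of a vector
$A = A^\alpha g_\alpha$ has components
$$\mathcal{D}_\mu A = \mathcal{D}_\mu(A^\alpha g_\alpha) = (\partial_\mu A^\alpha)g_\alpha + A^\alpha\Gamma^\beta_{\mu\alpha}g_\beta = \big(\partial_\mu A^\alpha + \Gamma^\alpha_{\mu\beta}A^\beta\big)g_\alpha, \tag{13.238}$$

which recovers the general relativistic expression.
If we recall from equation (13.143) that the metric is given by
$g_{\mu\nu} = g_\mu\cdot g_\nu$, we can now write

$$\partial_\mu g_{\nu\lambda} = (\mathcal{D}_\mu g_\nu)\cdot g_\lambda + g_\nu\cdot(\mathcal{D}_\mu g_\lambda), \tag{13.239}$$

so that
$$\partial_\mu g_{\nu\lambda} = \Gamma^\alpha_{\mu\nu}g_{\alpha\lambda} + \Gamma^\alpha_{\mu\lambda}g_{\alpha\nu}. \tag{13.240}$$
This is the metric compatibility condition for the connection. The second
important condition on the connection, for pure general relativity, is symmetry
on its lower indices. This follows from the torsion-free condition, since

$$0 = (g_\mu\wedge g_\nu)\cdot(\mathcal{D}\wedge g^\lambda) = g_\mu\cdot(\mathcal{D}_\nu g^\lambda) - g_\nu\cdot(\mathcal{D}_\mu g^\lambda) = g^\lambda\cdot(\mathcal{D}_\mu g_\nu - \mathcal{D}_\nu g_\mu). \tag{13.241}$$
We can therefore read off that
$$\mathcal{D}_\mu g_\nu - \mathcal{D}_\nu g_\mu = 0. \tag{13.242}$$
It follows that, in the absence of torsion,
$$\Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} = 0. \tag{13.243}$$

This equation and equation (13.240) together define the Christoffel connection.
The equations can be inverted to recover the connection in terms of derivatives
of the metric. Rather than reproduce the standard derivation at this point, we
will instead demonstrate how to invert equation (13.234) to find $\Omega(a)$ in
terms of the $\bar{h}$ field.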
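For readers who want to see the standard component inversion in action, the result of combining metric compatibility (13.240) with the symmetry condition (13.243) is the familiar formula $\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}g^{\lambda\sigma}(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu})$. The sketch below evaluates it by central differences for the unit 2-sphere, a metric chosen here purely as an illustration (it does not appear in the text); the function names and the step size are assumptions.

```python
import math

def sphere_metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivative(metric, x, k, step=1e-5):
    """Central-difference estimate of d g_ij / d x^k."""
    xp, xm = list(x), list(x)
    xp[k] += step
    xm[k] -= step
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * step) for j in range(n)]
            for i in range(n)]

def christoffel(metric, x):
    """gamma[lam][mu][nu] = Gamma^lam_{mu nu} for a two-dimensional metric."""
    n = len(x)
    ginv = inverse_2x2(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j]
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for lam in range(n):
        for mu in range(n):
            for nu in range(n):
                gamma[lam][mu][nu] = 0.5 * sum(
                    ginv[lam][s] * (dg[mu][s][nu] + dg[nu][s][mu] - dg[s][mu][nu])
                    for s in range(n))
    return gamma

theta0 = 1.0
gamma = christoffel(sphere_metric, (theta0, 0.3))
print(gamma[0][1][1], -math.sin(theta0) * math.cos(theta0))
print(gamma[1][0][1], math.cos(theta0) / math.sin(theta0))
```

The two printed pairs agree to within the finite-difference error, reproducing $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \cot\theta$. As the text goes on to note, coefficients of this kind mix gauge terms with terms induced by the choice of curvilinear coordinates.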
Returning to the definition of the $H(a)$ and $\mathcal{H}(a)$ tensors of
equations (13.194) and (13.195), the absence of torsion tells us that
$$-H(a) = \bar{h}(\partial_b)\wedge(\Omega(b)\cdot a). \tag{13.244}$$
At this point it is useful to introduce the displacement-gauge-covariant
connection
$$\omega(a) = \Omega(\underline{h}(a)). \tag{13.245}$$
Under displacements this transforms covariantly,

$$\omega'(a; x) = \omega(a; x'). \tag{13.246}$$

Under rotations the transformation law for $\omega(a)$ is somewhat more
complicated than that for $\Omega(a)$, so it is usually preferable to deal with
the latter when discussing rotation-gauge transformations. Equation (13.244) now
becomes

$$\partial_b\wedge(\omega(b)\cdot a) = -H(a), \tag{13.247}$$
which gives $\omega(a)$ in terms of $\bar{h}$ and its derivatives. To solve this
we first compute

$$\partial_a\wedge\partial_b\wedge(\omega(b)\cdot a) = 2\,\partial_b\wedge\omega(b) = -\partial_b\wedge H(b). \tag{13.248}$$
Now, taking the inner product with a again, we obtain
$$\omega(a) - \partial_b\wedge(a\cdot\omega(b)) = -\tfrac{1}{2}a\cdot(\partial_b\wedge H(b)). \tag{13.249}$$
We can therefore write
$$\omega(a) = H(a) - \tfrac{1}{2}a\cdot(\partial_b\wedge H(b)), \tag{13.250}$$

which enables us to compute $\omega(a)$ directly. In the presence of spin an
additional term built from the spin tensor is added to the right-hand side. One
can now convert the solution for $\omega(a)$ into a set of Christoffel
coefficients, if desired. One
disadvantage of the latter is that they mix up gauge terms with terms induced by 
a choice of curvilinear coordinates. From the manifold viewpoint this is sensible, 
but it is less natural in the gauge theory context. 

Next we turn to the form of the Riemann tensor in general relativity. In terms 
of the connection, this is 


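$$\begin{aligned}
R_{\mu\nu\rho}{}^{\sigma} &= \partial_\mu\Gamma^\sigma_{\nu\rho} - \partial_\nu\Gamma^\sigma_{\mu\rho} + \Gamma^\alpha_{\nu\rho}\Gamma^\sigma_{\mu\alpha} - \Gamma^\alpha_{\mu\rho}\Gamma^\sigma_{\nu\alpha} \\
&= \partial_\mu\big(g^\sigma\cdot(\mathcal{D}_\nu g_\rho)\big) - \partial_\nu\big(g^\sigma\cdot(\mathcal{D}_\mu g_\rho)\big) - (\mathcal{D}_\mu g^\sigma)\cdot(\mathcal{D}_\nu g_\rho) + (\mathcal{D}_\nu g^\sigma)\cdot(\mathcal{D}_\mu g_\rho) \\
&= g^\sigma\cdot\big(\mathcal{D}_\mu\mathcal{D}_\nu g_\rho - \mathcal{D}_\nu\mathcal{D}_\mu g_\rho\big),
\end{aligned} \tag{13.251}$$

from which we can read off that
$$R_{\mu\nu\rho}{}^{\sigma} = \mathcal{R}(g_\mu\wedge g_\nu)\cdot(g_\rho\wedge g^\sigma). \tag{13.252}$$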


This converts directly between the gauge theory and tensor formulations of grav- 
ity. One can also check that the contractions defined earlier are all equivalent to 
their general relativistic counterparts, so the gauge theory equation (13.207), in 
the torsion-free case, has the same content as the Einstein equations. The main 
differences between the two theories are topological in nature, and one can argue 
that such considerations are beyond the scope of the (local) theory of general 
relativity anyway. 


13.5.6 Currents and Killing vectors 


The gauge theory we have constructed is founded on an action principle in a flat 
spacetime. It follows that Noether’s theorem still holds, and that symmetries 
of the action result in a conserved vector current J. Every such vector has a 
corresponding covariant equivalent. To find this we first write 


$$\nabla\cdot J = I\,\nabla\wedge(IJ) = 0, \tag{13.253}$$
so, assuming no torsion is present, we have
$$\bar{h}\big(\nabla\wedge(IJ)\big) = \mathcal{D}\wedge\bar{h}(IJ) = 0. \tag{13.254}$$
We can therefore write
$$I\mathcal{J} = \bar{h}(IJ) = I\,\underline{h}^{-1}(J)\det(h), \tag{13.255}$$

which defines the covariant current $\mathcal{J}$ in terms of J. The covariant
vector $\mathcal{J}$ then satisfies

$$\mathcal{D}\cdot\mathcal{J} = 0. \tag{13.256}$$


There is a vector J conjugate to each continuous symmetry of the action. If we 
attempt to find conserved vectors conjugate to translations and rotations, how- 
ever, we do not discover any new information. In both cases the conjugate tensor 
turns out to be zero once the field equations are employed. This is due to the 
manner of the coupling of the h field. Variation with respect to h can be viewed as 
defining the total energy-momentum tensor, and this is zero because there is no 
derivative term for the h field in the action. It is traditional, of course, to single 
out (minus) the gravitational contribution to the total energy-momentum ten- 
sor (the Einstein tensor), and then equate this to the matter energy-momentum 
tensor. 

A covariantly-conserved vector $\mathcal{J}$ gives rise to a conserved scalar
because it can always be converted back to a non-covariant vector J satisfying
$\nabla\cdot J = 0$. The same is not true of covariant conservation of a tensor,
such as $\mathcal{G}(a)$. Tensors only give rise to useful conserved quantities
in the presence of additional symmetries of the Lagrangian. This is the case
when the $\bar{h}$ field is independent of the derivative along a global vector
field. In this case one can construct a coordinate system such that the metric
$g_{\mu\nu}$ is independent of one of the coordinates. If we call this $x^0$, we
have

$$\frac{\partial}{\partial x^0}g_{\mu\nu} = g_\mu\cdot(g_0\cdot\mathcal{D}g_\nu) + g_\nu\cdot(g_0\cdot\mathcal{D}g_\mu) = 0. \tag{13.257}$$

But, for a coordinate frame in the absence of torsion, equation (13.242) holds
and we have
$$g_\mu\cdot(g_\nu\cdot\mathcal{D}K) + g_\nu\cdot(g_\mu\cdot\mathcal{D}K) = 0, \tag{13.258}$$
where $K = g_0$ is the covariant Killing vector. In coordinate-free form we can
write
$$a\cdot(b\cdot\mathcal{D}K) + b\cdot(a\cdot\mathcal{D}K) = 0 \tag{13.259}$$
for any two vector fields a and b. This can be used as an alternative definition
for a Killing vector. Contracting with $\partial_a\cdot\partial_b$ immediately
tells us that K is divergenceless.


13.5.7 Point particle motion 


General relativity typically models observers as point particles following geodesic 
paths, as defined by the geodesic equation. But the gauge approach has dealt 
solely with the properties of classical and quantum fields. To complete the proof
of the equivalence of the gauge approach and general relativity, we must recover 
the geodesic equation from the minimally-coupled Dirac equation. In coordinate 
form, the geodesic equation is 


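$$\dot{v}^\mu + v^\alpha v^\beta\,\Gamma^\mu_{\alpha\beta} = 0, \tag{13.260}$$

where $v^\mu = \dot{x}^\mu$ and the overdots denote the derivative with respect to
proper time. This is defined such that

$$g_{\mu\nu}v^\mu v^\nu = 1. \tag{13.261}$$
To convert to covariant form we introduce the vector
$$v = v^\mu g_\mu = \underline{h}^{-1}(\dot{x}), \qquad v^2 = 1. \tag{13.262}$$

This is a covariant vector, though for aesthetic reasons we do not write this in a
calligraphic font. The derivative with respect to proper time is

$$\partial_\tau = \dot{x}^\mu\partial_\mu = v\cdot\bar{h}(\nabla). \tag{13.263}$$
The geodesic equation (13.260) can now be written
$$\partial_\tau v - v^\mu\partial_\tau g_\mu + v^\alpha v^\beta(\mathcal{D}_\alpha g_\beta) = \dot{v} + \omega(v)\cdot v = 0. \tag{13.264}$$
The gauge theory form of the geodesic equation is therefore
$$v\cdot\mathcal{D}v = \dot{v} + \omega(v)\cdot v = 0. \tag{13.265}$$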


This equation is also recovered by finding the paths that minimise the proper 
time interval 


$$S = \int d\lambda\,\sqrt{g_{\mu\nu}\frac{dx^\mu}{d\lambda}\frac{dx^\nu}{d\lambda}}. \tag{13.266}$$


Geodesics are classified into timelike, lightlike or spacelike according to the
value of $v^2$, which can be +1, 0 or -1 respectively. Point particles with mass
follow timelike geodesics.

The process by which classical paths are recovered from Dirac theory is dis- 
cussed in section 12.2.1. The essential term in the action is the kinetic one, which 
we manipulate in the same way to write 


$$\det(h)^{-1}\langle D\psi I\gamma_3\tilde{\psi}\rangle = \det(h)^{-1}\langle\mathcal{J}D\psi I\sigma_3\psi^{-1}\rangle, \tag{13.267}$$

where $\mathcal{J} = \psi\gamma_0\tilde{\psi}$. Equation (13.255) relates the
covariant current $\mathcal{J}$ to the divergenceless current J. The classical
limit is formed by concentrating the density onto a single streamline of J and
ignoring terms in the action perpendicular to the flow. The action therefore
contains the term


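$$\det(h)^{-1}\langle\mathcal{J}\cdot g^\mu D_\mu\psi I\sigma_3\psi^{-1}\rangle = \big\langle\big(J\cdot\nabla\psi + \tfrac{1}{2}\Omega(J)\psi\big)I\sigma_3\psi^{-1}\big\rangle. \tag{13.268}$$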
Separating out the rotor dependence, as before, and converting to proper time
derivatives, the equations of motion are

$$v\cdot\mathcal{D}S + 2p\wedge v = 0 \tag{13.269}$$
and
$$v\cdot\mathcal{D}p = 0. \tag{13.270}$$

Here $v = \underline{h}^{-1}(\dot{x}) = R\gamma_0\tilde{R}$ and
$S = RI\sigma_3\tilde{R}$. Classical point-particle motion is recovered by
setting the spin to zero, so that p and v are aligned, and fixing
$p\cdot v = m$. In this case we recover precisely the geodesic equation.

This derivation is unusual, but it is important for two reasons. The geodesic 
equation tells us that point particles follow the same paths regardless of their 
mass and so implies the equivalence of gravitational and inertial mass. This is the 
weak equivalence principle, a fundamental ingredient in general relativity. From 
the gauge theory perspective, the weak equivalence principle is derived from the 
classical limit of the Dirac equation. The only principle invoked in constructing 
the covariant Dirac equation was minimal coupling, so at one level this has the 
consequence of enforcing the weak equivalence principle. One can also argue 
that minimal coupling is the essence of the full equivalence principle, which tells 
us how physics should appear locally to a freely-falling observer. The second 
important feature of this derivation is that it points out the limitations of the 
weak equivalence principle. Both the wave nature of matter and the existence 
of quantum spin ensure that the geodesic equation is an approximation, and 
there are many quantum effects in gravitational backgrounds (such as black hole 
absorption) where the particle mass is important. 

If a Killing vector is present, equation (13.259) tells us that 


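$$v\cdot(v\cdot\mathcal{D}K) = 0. \tag{13.271}$$
So, for a particle satisfying the geodesic equation, we find that
$$\partial_\tau(v\cdot K) = v\cdot\mathcal{D}(v\cdot K) = K\cdot(v\cdot\mathcal{D}v) + v\cdot(v\cdot\mathcal{D}K) = 0. \tag{13.272}$$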


It follows that the quantity v-K is conserved along the worldline of a freely- 
falling particle. For stationary matter configurations, this can be used to define 
the conserved energy of the particle. 


13.5.8 Electromagnetism in a gravitational background 


The electromagnetic vector potential A ensures that the Dirac equation is co- 
variant under local phase transformations. In equation (13.222) we found that 
the covariant action integral for the electromagnetic field in a gravitational back- 
ground is given by 


\[ S = \int |d^4x| \, \det(\mathsf{h})^{-1} \, \tfrac{1}{2} \mathcal{F} \cdot \mathcal{F}, \tag{13.273} \]
where
\[ \mathcal{F} = \bar{\mathsf{h}}(F). \tag{13.274} \]


The field strength F is covariant under local translations and rotations, as well 
as being phase-invariant. 

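We can include a source term by adding an $\mathcal{A} \cdot \mathcal{J}$ term, where $\mathcal{A} = \bar{\mathsf{h}}(A)$ and
$\mathcal{J}$ is a covariant vector. For example, when coupling to a fermion $\mathcal{J}$ is given by
the Dirac current $\psi \gamma_0 \tilde{\psi}$. The full action integral is therefore


\[ S = \int |d^4x| \, \det(\mathsf{h})^{-1} \bigl( \tfrac{1}{2} \mathcal{F} \cdot \mathcal{F} - \mathcal{A} \cdot \mathcal{J} \bigr). \tag{13.275} \]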


To find the field equations for electromagnetism we vary this integral with respect 
to the underlying dynamical variable A, with h and J treated as external fields. 
The result is the equation 


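\[ \nabla \cdot \bigl( \mathsf{h}\bar{\mathsf{h}}(\nabla \wedge A) \det(\mathsf{h})^{-1} \bigr) = J, \tag{13.276} \]

where
\[ J = \det(\mathsf{h})^{-1} \mathsf{h}(\mathcal{J}). \tag{13.277} \]
Equation (13.276) combines with the identity $\nabla \wedge F = 0$ to form the full set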


of Maxwell equations in a gravitational background. Some insight into these 
equations is provided by performing a spacetime split and writing 


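\[ \boldsymbol{E} + cI\boldsymbol{B} = F, \qquad \boldsymbol{D} + I\boldsymbol{H}/c = \epsilon_0 \, \mathsf{h}\bar{\mathsf{h}}(F) \det(\mathsf{h})^{-1}, \tag{13.278} \]


where we have temporarily included the factors of $c$ and $\epsilon_0$. In terms of these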
variables Maxwell’s equations can be written in the familiar forms 


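\[ \begin{aligned} \boldsymbol{\nabla} \cdot \boldsymbol{B} &= 0, & \boldsymbol{\nabla} \cdot \boldsymbol{D} &= \rho, \\ \boldsymbol{\nabla} \wedge \boldsymbol{E} + I \frac{\partial \boldsymbol{B}}{\partial t} &= 0, & \frac{\partial \boldsymbol{D}}{\partial t} + \boldsymbol{\nabla} \cdot (I\boldsymbol{H}) &= -\boldsymbol{J}, \end{aligned} \tag{13.279} \]


where $J\gamma_0 = \rho + \boldsymbol{J}$. These forms of the equations illustrate how $\mathsf{h}\bar{\mathsf{h}} \det(\mathsf{h})^{-1}$
is a generalized permittivity/permeability tensor, defining the properties of the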
space through which the electromagnetic field propagates. For example, the 
bending of light by the sun can be easily understood in terms of the properties 
of the dielectric defined by the h field exterior to it. 
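
As a quick consistency check on equation (13.278), consider the flat-space limit. This is only a sketch, under the assumption that the gauge field reduces to the identity, $\mathsf{h}(a) = a$, so that $\det(\mathsf{h}) = 1$. The vacuum permeability $\mu_0$ is not named in the text and enters here only through the standard relation $\epsilon_0 \mu_0 c^2 = 1$.

```latex
\begin{aligned}
\boldsymbol{D} + I\boldsymbol{H}/c &= \epsilon_0 F
  = \epsilon_0 \left( \boldsymbol{E} + cI\boldsymbol{B} \right) \\
\Rightarrow\quad \boldsymbol{D} &= \epsilon_0 \boldsymbol{E}, \qquad
\boldsymbol{H} = \epsilon_0 c^2 \boldsymbol{B} = \boldsymbol{B}/\mu_0 .
\end{aligned}
```

The familiar vacuum constitutive relations are recovered, which is why a non-trivial h field can be read as a medium with position-dependent permittivity and permeability.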

So far, however, we have failed to achieve a covariant form of the Maxwell 
equations. We have, furthermore, failed to unite the separate equations into a 
single equation. To find a covariant equation, we simplify matters by ignoring 
torsion effects, so that we can write 


\[ \mathcal{D} \wedge \mathcal{F} = \bar{\mathsf{h}}(\nabla \wedge F) = 0. \tag{13.280} \]
Next, we use a double-duality transformation to write the left-hand side of
equation (13.276) as


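\[ \begin{aligned} \nabla \cdot \bigl( \mathsf{h}(\mathcal{F}) \det(\mathsf{h})^{-1} \bigr) &= I \nabla \wedge \bigl( I \mathsf{h}(\mathcal{F}) \det(\mathsf{h})^{-1} \bigr) \\ &= I \nabla \wedge \bigl( \bar{\mathsf{h}}^{-1}(I\mathcal{F}) \bigr) \\ &= I \bar{\mathsf{h}}^{-1} \bigl( \mathcal{D} \wedge (I\mathcal{F}) \bigr). \end{aligned} \tag{13.281} \]
Equation (13.276) now becomes
\[ \mathcal{D} \cdot \mathcal{F} = \mathcal{J}, \tag{13.282} \]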


and equations (13.280) and (13.282) combine into the single covariant equation
\[ \mathcal{D} \mathcal{F} = \mathcal{J}. \tag{13.283} \]


This achieves our objective. Equation (13.283) is manifestly covariant and gen- 
eralises the free-field Maxwell equations to a gravitational background in an ob- 
vious and natural manner. In the presence of torsion an additional term appears 
in the covariant expression of the Maxwell equations. But in such circumstances 
the spin fields generating the torsion are likely to interact strongly with the 
electromagnetic field and swamp most interesting gravitational effects. 


13.6 The structure of the Riemann tensor 


The Riemann tensor R(B) contains a remarkable amount of algebraic structure, 
much of which is hidden in the tensor calculus approach. Again, we assume that 
there is no torsion present, so that the second field equation reduces to (13.234). 
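Writing $\mathcal{A} = \bar{\mathsf{h}}(A)$ we have


\[ \mathcal{D} \wedge \mathcal{A} = \bar{\mathsf{h}}(\nabla \wedge A), \tag{13.284} \]
so
\[ \mathcal{D} \wedge (\mathcal{D} \wedge \mathcal{A}) = \bar{\mathsf{h}}(\nabla \wedge \nabla \wedge A) = 0. \tag{13.285} \]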
It follows that 


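\[ \begin{aligned} g^\mu \wedge \bigl( D_\mu ( g^\nu \wedge (D_\nu \mathcal{A}) ) \bigr) &= g^\mu \wedge g^\nu \wedge (D_\mu D_\nu \mathcal{A}) \\ &= \tfrac{1}{2} g^\mu \wedge g^\nu \wedge ([D_\mu, D_\nu] \mathcal{A}) \\ &= \tfrac{1}{2} g^\mu \wedge g^\nu \wedge (\mathsf{R}_{\mu\nu} \times \mathcal{A}). \end{aligned} \tag{13.286} \]


So, for any multivector $M$,
\[ \partial_a \wedge \partial_b \wedge (\mathcal{R}(a \wedge b) \times M) = 0, \tag{13.287} \]
which is a covariant equation.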
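To analyse equation (13.287) further we set $M$ equal to the vector $c$, and
protract with $\partial_c$ to form


\[ \partial_c \wedge \partial_a \wedge \partial_b \wedge (\mathcal{R}(a \wedge b) \times c) = -2 \, \partial_a \wedge \partial_b \wedge \mathcal{R}(a \wedge b) = 0. \tag{13.288} \]

Now forming the inner product with $c$ we obtain
\[ 2 \, \partial_a \wedge \mathcal{R}(a \wedge c) + \partial_a \wedge \partial_b \wedge (\mathcal{R}(a \wedge b) \times c) = 0, \tag{13.289} \]

so that we are left with the compact identity
\[ \partial_a \wedge \mathcal{R}(a \wedge b) = 0. \tag{13.290} \]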


This summarises all of the symmetries of R(B) in the case of zero torsion. 
Equation (13.290) says that the trivector $\partial_a \wedge \mathcal{R}(a \wedge b)$ vanishes for all values of
the vector $b$, so gives a set of $4 \times 4 = 16$ equations. These reduce the number
of independent degrees of freedom in R(B) from 36 to 20, the expected number 
for general relativity. Contracting equation (13.290) we obtain 


\[ \partial_b \cdot (\partial_a \wedge \mathcal{R}(a \wedge b)) = \partial_a \wedge \mathcal{R}(a) = 0, \tag{13.291} \]
which shows that the Ricci tensor R(a) is symmetric. The same is therefore 
true of the Einstein tensor. In the absence of any spin-torsion interactions, 
the matter energy-momentum tensor must also be symmetric, as is the case 


for electromagnetism and the relativistic fluid. The covariant Riemann tensor 
satisfies the further useful identities, 


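\[ \begin{aligned} \partial_c \wedge (a \cdot \mathcal{R}(c \wedge b)) &= \mathcal{R}(a \wedge b), \\ (B \cdot \partial_a) \cdot \mathcal{R}(a \wedge b) &= -\partial_a \, B \cdot \mathcal{R}(a \wedge b). \end{aligned} \tag{13.292} \]
It follows that
\[ \partial_b \wedge \bigl( (B \cdot \partial_a) \cdot \mathcal{R}(a \wedge b) \bigr) = -2 \mathcal{R}(B) = -\partial_b \wedge \partial_a \, (B \cdot \mathcal{R}(a \wedge b)). \tag{13.293} \]
The Riemann tensor is therefore also symmetric,
\[ B_1 \cdot \mathcal{R}(B_2) = B_2 \cdot \mathcal{R}(B_1). \tag{13.294} \]


That is, $\mathcal{R}(B) = \bar{\mathcal{R}}(B)$.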
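
The counting argument above (36 components reduced by 16 constraints to 20) can be reproduced mechanically. The following is a minimal sketch rather than part of the original derivation: the function names are illustrative, the constraints are assumed to be independent, and the closed form $n^2(n^2-1)/12$ is the standard textbook count, used only as a cross-check.

```python
from math import comb


def riemann_components(n: int) -> int:
    """Count independent components of a torsion-free Riemann tensor in n dimensions.

    A general linear map from bivectors to bivectors has comb(n, 2)**2 components.
    The identity d_a ^ R(a ^ b) = 0 is a trivector equation (comb(n, 3) components)
    for each of the n independent choices of the vector b.
    """
    general = comb(n, 2) ** 2
    constraints = n * comb(n, 3)
    return general - constraints


def closed_form(n: int) -> int:
    # Standard formula n^2 (n^2 - 1) / 12, included only as an independent check.
    return n * n * (n * n - 1) // 12


counts = {n: riemann_components(n) for n in range(2, 7)}
print(counts[4])  # 36 - 16 = 20 in four dimensions
```

The two counts agree for every dimension tried here, which supports treating the 16 equations in four dimensions as independent.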


13.6.1 The Weyl tensor 


The structure of the Riemann tensor is more clearly seen by separating out the 
matter content, as contained in the Ricci tensor. Since the contraction of $\mathcal{R}(a \wedge b)$
results in the Ricci tensor $\mathcal{R}(a)$, we expect that $\mathcal{R}(a \wedge b)$ will contain a term in
$\mathcal{R}(a) \wedge b$. This must be matched with a term in $a \wedge \mathcal{R}(b)$, since it is only the sum
of these that is a function of $a \wedge b$. Contracting this sum we obtain


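\[ \begin{aligned} \partial_a \cdot (\mathcal{R}(a) \wedge b + a \wedge \mathcal{R}(b)) &= b\mathcal{R} - \mathcal{R}(b) + 4\mathcal{R}(b) - \mathcal{R}(b) \\ &= 2\mathcal{R}(b) + b\mathcal{R}, \end{aligned} \tag{13.295} \]
and it follows that
\[ \partial_a \cdot \bigl( \tfrac{1}{2} (\mathcal{R}(a) \wedge b + a \wedge \mathcal{R}(b)) - \tfrac{1}{6} a \wedge b \, \mathcal{R} \bigr) = \mathcal{R}(b). \tag{13.296} \]
We can therefore write


\[ \mathcal{R}(a \wedge b) = \mathcal{W}(a \wedge b) + \tfrac{1}{2} (\mathcal{R}(a) \wedge b + a \wedge \mathcal{R}(b)) - \tfrac{1}{6} a \wedge b \, \mathcal{R}, \tag{13.297} \]


where $\mathcal{W}(B)$ is the Weyl tensor.
From its definition the Weyl tensor must satisfy


\[ \partial_a \cdot \mathcal{W}(a \wedge b) = 0. \tag{13.298} \]
As the Ricci tensor is symmetric, we also have
\[ \partial_a \wedge \bigl( \tfrac{1}{2} (\mathcal{R}(a) \wedge b + a \wedge \mathcal{R}(b)) - \tfrac{1}{6} a \wedge b \, \mathcal{R} \bigr) = 0, \tag{13.299} \]
so the Weyl tensor also satisfies
\[ \partial_a \wedge \mathcal{W}(a \wedge b) = 0. \tag{13.300} \]
Equations (13.298) and (13.300) combine into the single equation
\[ \partial_a \mathcal{W}(a \wedge b) = 0. \tag{13.301} \]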
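
The coefficient 1/6 in equation (13.297) rests on the contraction $\partial_a \cdot (a \wedge b) = 3b$ in four dimensions. The sketch below checks that contraction numerically. It assumes a metric signature (+, -, -, -) and represents vectors by their components in an orthonormal frame; the helper names are illustrative and not taken from the text.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+, -, -, -)


def dot(u, v):
    """Inner product of two vectors given by orthonormal-frame components."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))


def vec_dot_wedge(u, v, w):
    """u . (v ^ w) = (u . v) w - (u . w) v, returned as a vector."""
    uv, uw = dot(u, v), dot(u, w)
    return [uv * wi - uw * vi for vi, wi in zip(v, w)]


def contract_a_wedge_b(b):
    """Evaluate d_a . (a ^ b) as the frame sum of e^mu . (e_mu ^ b)."""
    total = [0.0] * 4
    for mu in range(4):
        e = [1.0 if nu == mu else 0.0 for nu in range(4)]
        e_recip = [ETA[mu] * c for c in e]  # reciprocal frame vector e^mu
        term = vec_dot_wedge(e_recip, e, b)
        total = [t + x for t, x in zip(total, term)]
    return total


b = [0.3, -1.2, 2.0, 0.7]
result = contract_a_wedge_b(b)  # equals 3 * b component by component
```

With this factor of 3, the trace of $\tfrac{1}{6} a \wedge b \, \mathcal{R}$ is $\tfrac{1}{2} b \mathcal{R}$, which cancels the $b\mathcal{R}$ term left over from halving equation (13.295).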


This compact equation is unique to the geometric algebra formulation, as it 
involves the geometric product. To study the consequences of equation (13.301) 
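it is useful to introduce the $\{\gamma_\mu\}$ frame and write the four equations for $b$ equalling
each of the $\gamma_\mu$ vectors as


\[ \begin{aligned} \sigma_1 \mathcal{W}(\sigma_1) + \sigma_2 \mathcal{W}(\sigma_2) + \sigma_3 \mathcal{W}(\sigma_3) &= 0, \\ \sigma_1 \mathcal{W}(\sigma_1) - I\sigma_2 \mathcal{W}(I\sigma_2) - I\sigma_3 \mathcal{W}(I\sigma_3) &= 0, \\ -I\sigma_1 \mathcal{W}(I\sigma_1) + \sigma_2 \mathcal{W}(\sigma_2) - I\sigma_3 \mathcal{W}(I\sigma_3) &= 0, \\ -I\sigma_1 \mathcal{W}(I\sigma_1) - I\sigma_2 \mathcal{W}(I\sigma_2) + \sigma_3 \mathcal{W}(\sigma_3) &= 0. \end{aligned} \tag{13.302} \]
Summing the final three equations, and employing the first, produces
\[ I\sigma_k \mathcal{W}(I\sigma_k) = 0. \tag{13.303} \]
Substituting this into each of the final three equations produces
\[ \mathcal{W}(I\sigma_k) = I \mathcal{W}(\sigma_k), \tag{13.304} \]
and it follows that the Weyl tensor satisfies
\[ \mathcal{W}(IB) = I \mathcal{W}(B). \tag{13.305} \]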


This says that the Weyl tensor is self-dual. In the two-spinor formalism of 
Penrose and Rindler the duality of the Weyl tensor is expressed in terms of a 
complex formulation. The spacetime algebra shows that this complex structure 
arises geometrically through the properties of the pseudoscalar. 
Given the self-duality of the Weyl tensor, the remaining content of
equation (13.302) is summarised by


\[ \sigma_k \mathcal{W}(\sigma_k) = 0. \tag{13.306} \]


This equation says that, viewed as a three-dimensional complex linear function, 
W(B) is symmetric and traceless. This gives W(B) five complex, or ten real 
degrees of freedom. The gauge-invariant information is held in the complex 
eigenvalues of W(B), since these are invariant under rotations. As these must 
sum to zero, only two are independent. This leaves a set of four real intrinsic 
scalar quantities. 

Overall, R(B) has 20 degrees of freedom, six of which are contained in the 
freedom to perform arbitrary local rotations. Of the remaining 14 physical de- 
grees of freedom, four are contained in the two complex eigenvalues of W(B), 
and a further four in the real eigenvalues of the matter stress-energy tensor. 
The six remaining physical degrees of freedom determine the rotation between 
the frame that diagonalises G(a) and the frame that diagonalises W(B). This 
identification of the physical degrees of freedom contained in R(B) is physically 
very revealing and extremely useful in guiding solution strategies. 


13.6.2 The Bianchi identities 


Further information about the Riemann tensor is contained in the Bianchi iden- 
tities. These follow immediately from the Jacobi identity in the form 


\[ [D_\alpha, [D_\beta, D_\gamma]] A + \text{cyclic permutations} = 0. \tag{13.307} \]


It follows that
\[ D_\alpha \mathsf{R}_{\beta\gamma} + \text{cyclic permutations} = 0, \tag{13.308} \]


which we need to express as a fully covariant relation. We start by forming the 
adjoint relation, 


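\[ \partial_a \wedge \partial_b \wedge \partial_c \bigl\langle \bigl( a \cdot \nabla \mathsf{R}(b \wedge c) + \Omega(a) \times \mathsf{R}(b \wedge c) \bigr) B \bigr\rangle = 0, \tag{13.309} \]
which simplifies to
\[ \nabla \wedge \bar{\mathsf{R}}(B) - \partial_a \wedge \bar{\mathsf{R}}(\Omega(a) \times B) = 0, \tag{13.310} \]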


where B is a constant bivector. To make further progress we again assume that 
the torsion vanishes. The Riemann tensor is then symmetric, so 


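\[ \bar{\mathsf{R}}(B) = \bar{\mathsf{h}}^{-1} \bar{\mathcal{R}}(B) = \bar{\mathsf{h}}^{-1} \mathcal{R}(B). \tag{13.311} \]
We can therefore write


\[ \nabla \wedge \bigl( \bar{\mathsf{h}}^{-1} \mathcal{R}(B) \bigr) - \partial_a \wedge \bar{\mathsf{h}}^{-1} \mathcal{R}(\Omega(a) \times B) = 0. \tag{13.312} \]
Now acting on this equation with $\bar{\mathsf{h}}$ and using equation (13.235), we establish
the covariant result


\[ \mathcal{D} \wedge \mathcal{R}(B) - \partial_a \wedge \mathcal{R}(\omega(a) \times B) = 0. \tag{13.313} \]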


This result takes a more natural form when B becomes an arbitrary function of 
position, and we write the Bianchi identity as 


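\[ \partial_a \wedge \bigl( a \cdot \mathcal{D} \mathcal{R}(B) - \mathcal{R}(a \cdot \mathcal{D} B) \bigr) = 0. \tag{13.314} \]
We can extend the overdot notation of section 11.1 in the natural manner to
write equation (13.314) as
\[ \dot{\mathcal{D}} \wedge \dot{\mathcal{R}}(B) = 0. \tag{13.315} \]
This is a highly compact, elegant expression of the Bianchi identity, though it is
often easier to use the more explicit form of equation (13.314).
The contracted Bianchi identity is obtained from
\[ \begin{aligned} (\partial_a \wedge \partial_b) \cdot \bigl( \dot{\mathcal{D}} \wedge \dot{\mathcal{R}}(a \wedge b) \bigr) &= \partial_a \cdot \bigl( \dot{\mathcal{R}}(a \wedge \dot{\mathcal{D}}) + \dot{\mathcal{D}} \wedge \dot{\mathcal{R}}(a) \bigr) \\ &= 2 \dot{\mathcal{R}}(\dot{\mathcal{D}}) - \mathcal{D} \mathcal{R}, \end{aligned} \tag{13.316} \]
from which we find
\[ \dot{\mathcal{G}}(\dot{\mathcal{D}}) = 0. \tag{13.317} \]
The adjoint form of this equation is sometimes more useful:
\[ \dot{\mathcal{D}} \cdot \dot{\mathcal{G}}(a) = \mathcal{D} \cdot \mathcal{G}(a) - \partial_b \cdot \mathcal{G}(b \cdot \mathcal{D} a) = 0. \tag{13.318} \]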


This is the covariant expression of conservation of the Einstein tensor. It follows 
that the total matter energy-momentum tensor must satisfy the same relation. 
With the gravitational interaction turned off, the free-field (or flat-space) energy- 
momentum tensor must be symmetric and divergence-free. This is the case for 
the functional electromagnetic and fluid energy-momentum tensors. This is not 
true of the Dirac theory, where the presence of spin alters many of the preceding 
results and distorts much of the elegant structure of pure general relativity. 

The covariant conservation equation (13.318) does not give rise to conserved 
vector currents, and hence conserved scalars, unless a further symmetry is present 
in the gravitational fields. In this case one can construct a Killing vector K 
satisfying equation (13.259). This is sufficient to prove that 


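\[ \mathcal{G}(\partial_a) \cdot (a \cdot \mathcal{D} K) = 0, \tag{13.319} \]
which holds because $\mathcal{G}(a)$ is a symmetric tensor. It follows that
\[ \mathcal{D} \cdot (\mathcal{G}(K)) = \dot{\mathcal{D}} \cdot \dot{\mathcal{G}}(K) + \partial_a \cdot \mathcal{G}(a \cdot \mathcal{D} K) = 0, \tag{13.320} \]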


which yields a covariantly conserved vector. This can be converted to a spacetime 
current and hence to a conserved scalar quantity. 
13.7 Notes


Lagrangian field theory is discussed in many textbooks, particularly those that 
go on to treat quantum field theory. The texts by Itzykson & Zuber (1980) and 
Bjorken & Drell (1964) are again recommended, as are the book by Cheng & 
Li (1984) and the set of lecture notes by Coleman (1985). The history of gauge 
theories in the twentieth century is described in the set of collected papers edited 
by Taylor (2001). The use of the multivector derivative in analysing field La- 
grangians was introduced in the paper by Lasenby, Doran & Gull (1993a), and 
further refinements are contained in the thesis by Doran (1994). 

The discovery that gravity could be treated as a gauge theory was made ini- 
tially by Utiyama (1956) and Kibble (1961). An attempt at a quantum treatment 
along the lines suggested by Kibble was made by Feynman and is contained in 
the Feynman Lectures on Gravitation (Feynman, Morinigo & Wagner, 1995).
The application of spacetime algebra in the context of classical general relativity 
was promoted by Hestenes in the book Space-Time Algebra (1966) and the pa- 
per ‘Curvature calculations with spacetime algebra’ (1986). Many other authors 
have followed this route and a considerable literature now exists on applications 
of Clifford algebra in general relativity. Rather than attempt to list all of these, 
and run the risk of offending anyone we miss out, we recommend searching the 
main pre-print archives on the keyword ‘Clifford’. 

The particular combination of the gauge treatment of gravity and the space- 
time algebra developed here was first presented in full in the paper ‘Gravity, 
gauge theories and geometric algebra’, by Lasenby, Doran & Gull (1998). This 
contains an extensive list of references and we refer the reader there for further 
material. The form of the field equations in the presence of torsion is discussed 
in Doran et al. (1998). Readers of these papers, and the preceding chapter, will
notice that the notation and conventions for this subject have not yet settled 
down. We believe that this chapter represents an advance over previous work, 
but doubtless there is still room for improvement. While it has not been em- 
ployed in this chapter, we do recommend the underbar/overbar notation for 
linear functions in hand-written work. This helps keep track of the form of 
various objects, and avoids the problem of using different fonts to distinguish 
objects. Unfortunately, this notation tends to look too cluttered when typeset, 
which is why underbars are not employed in this book. 


13.8 Exercises 


13.1 The physical energy-momentum tensor for free-field electromagnetism is 
defined by 


\[ T_{\mathrm{em}}(a) = -\tfrac{1}{2} F a F. \]


Prove that each of $T_{\mathrm{em}}(x)$, $T_{\mathrm{em}}(a)$, $T_{\mathrm{em}}(xax)$ and $T_{\mathrm{em}}(B \cdot x)$ is
conserved. How many independent conserved constants can one construct
from these? How does this relate to the dimension of the spacetime
conformal group?

13.2 Prove that, in a space of dimension $n$,


\[ \nabla \left( \frac{1 + xa}{(1 + 2 a \cdot x + a^2 x^2)^{n/2}} \right) = 0, \]


where a is an arbitrary vector. 
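13.3 The field $\psi$ satisfies the minimally-coupled Dirac equation. Prove that


\[ \begin{aligned} \nabla \cdot (\psi \gamma_1 \tilde{\psi}) &= 2e A \cdot (\psi \gamma_2 \tilde{\psi}), \\ \nabla \cdot (\psi \gamma_2 \tilde{\psi}) &= -2e A \cdot (\psi \gamma_1 \tilde{\psi}). \end{aligned} \]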


Can you derive these relations from a transformation applied to the 
Dirac Lagrangian? 
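13.4 The coupled Maxwell-Dirac Lagrangian is defined by


\[ \mathcal{L} = \langle \nabla \psi I \gamma_3 \tilde{\psi} - e A \psi \gamma_0 \tilde{\psi} - m \psi \tilde{\psi} \rangle. \]


Find the canonical energy-momentum tensor. Prove that $\mathcal{L}$ is unchanged
in form by the transformations
\[ \psi(x) \mapsto R \psi(x'), \qquad A(x) \mapsto R A(x') \tilde{R}, \]


where $x' = \tilde{R} x R$ and $R$ is a constant rotor. Find the conserved tensor
conjugate to this transformation.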

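13.5 The gravitational field strength is defined in terms of the bivector
connection $\Omega_\mu$ by


\[ \mathsf{R}_{\mu\nu} = \partial_\mu \Omega_\nu - \partial_\nu \Omega_\mu + \Omega_\mu \times \Omega_\nu. \]
Verify that this vanishes if
\[ \Omega_\mu = -2 \, \partial_\mu R \, \tilde{R}, \]


where $R$ is a spacetime rotor.
13.6 Prove that, for non-vanishing spin, the $\omega(a)$ field is given by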


o(a) = H(a) — 5a-(A)AH(b)) + Sla) — Ž ra: (3,S(0)). 


13.7 Prove that, in the case of zero torsion, timelike paths which minimise
the proper time


s= fart? 


satisfy the geodesic equation $v \cdot \mathcal{D} v = 0$, where $v = \mathsf{h}^{-1}(\dot{x})$ and $v^2 = 1$.
Gravitation


In this chapter we explore the content of the gravitational field equations derived 
in section 13.5. In covariant notation these equations are 


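\[ \begin{aligned} \mathcal{G}(a) - \Lambda a &= \kappa \mathcal{T}(a), \\ \mathcal{H}(a) &= \kappa \mathcal{S}(a) + \tfrac{1}{2} \kappa \, (\partial_b \cdot \mathcal{S}(b)) \wedge a, \end{aligned} \tag{14.1} \]


where $\kappa = 8\pi G$, $\Lambda$ is the cosmological constant, $\mathcal{G}(a)$ and $\mathcal{H}(a)$ denote the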
Einstein and torsion tensors, and the matter sources are determined by the 
total energy-momentum tensor T(a) and the spin tensor S(a). Locally, the field 
equations define an Einstein—Cartan theory of gravitation. 

We start this chapter with a discussion of the various strategies we can adopt 
for solving the field equations. In particular, we focus on a new technique that 
is unique to the gauge theory approach. Of course, the physical content of the 
equations does not depend on the method of solution. But the field equations 
have proved so resistant to analysis that it is important to have a wide range 
of analytical approaches at our disposal. Most of the applications of interest do 
not involve macroscopic spin, so the torsion is set to zero. The only exception is 
when we consider self-consistent cosmological models for a single spinor field in 
a gravitational background. 

As a first application of our solution method we study spherically-symmetric, 
time-dependent systems. This setup is sufficiently general to use for studying 
non-rotating stars and black holes, and also cosmology. We study the properties 
of both classical and quantum matter in these backgrounds, looking in detail 
at scattering and absorption processes around a black hole. We then turn to 
static cylindrical systems. These are of limited astrophysical interest, but they 
do demonstrate some important features of our solution method. In particu- 
lar we find that, for certain matter distributions, the gravitational fields admit 
closed timelike curves. These matter distributions can give rise to violations of 
causality, which are therefore not ruled out by the theory without further
assumptions. We end this chapter with a discussion of axially-symmetric fields and
the Kerr solution. We give a novel derivation of the Kerr solution, which exposes 
a remarkable algebraic structure hidden in other approaches. We also describe 
a version of the Kerr solution that illustrates many of its physical features in a 
straightforward manner. 


14.1 Solving the field equations 


The traditional approach to solving the gravitational field equations in general 
relativity is to start with the metric $g_{\mu\nu}$. In equation (13.143) we showed that
the metric is recovered from the h(a) gauge field by setting


\[ g_{\mu\nu} = g_\mu \cdot g_\nu = \mathsf{h}^{-1}(e_\mu) \cdot \mathsf{h}^{-1}(e_\nu), \tag{14.2} \]


where the $\{e_\mu\}$ comprise a coordinate frame. The metric $g_{\mu\nu}$ is invariant under
rotation-gauge transformations, so working in terms of the metric removes this
gauge freedom from the outset. The result is that the field equations become a
set of non-linear, second-order differential equations for the terms in $g_{\mu\nu}$. Any
metric is potentially a solution of the field equations — one where the matter 
energy-momentum tensor is determined by the corresponding Einstein tensor. 
But this is seldom useful, as what is required is a solution for a given matter 
distribution. This is an extremely difficult problem. 
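
To make equation (14.2) concrete, the sketch below builds $g_{\mu\nu}$ from a set of vectors $g_\mu = \mathsf{h}^{-1}(e_\mu)$ expressed in an orthonormal frame. The particular vectors and the parameter `v` are hypothetical, chosen only to show an off-diagonal term appearing; they are not a solution discussed in the text, and the signature (+, -, -, -) is assumed.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+, -, -, -)


def dot(u, v):
    """Inner product of two vectors given by orthonormal-frame components."""
    return sum(e * a * b for e, a, b in zip(ETA, u, v))


def metric_from_frame(g_vectors):
    """Return the matrix g_{mu nu} = g_mu . g_nu for the supplied frame vectors."""
    return [[dot(gm, gn) for gn in g_vectors] for gm in g_vectors]


v = 0.25  # hypothetical mixing parameter, for illustration only
g_vectors = [
    [1.0, 0.0, 0.0, 0.0],  # g_0 = gamma_0
    [v, 1.0, 0.0, 0.0],    # g_1 = v gamma_0 + gamma_1
    [0.0, 0.0, 1.0, 0.0],  # g_2 = gamma_2
    [0.0, 0.0, 0.0, 1.0],  # g_3 = gamma_3
]
metric = metric_from_frame(g_vectors)
```

Because only inner products of the $g_\mu$ enter, rotating all four vectors by the same Lorentz transformation leaves `metric` unchanged, which is the rotation-gauge invariance noted above.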

A related shortcoming of the metric approach is that it is extremely difficult 
to set up a consistent perturbative scheme. The problem is that the metric is 
gauge-dependent, so it is not apparent which quantities can be treated as small. 
This can only be defined consistently in terms of covariant scalars, as these are 
the only gauge-invariant quantities. Clearly, then, we should aim to solve the 
equations directly in terms of these quantities. Such a method is described here, 
and applied to a range of problems in this chapter. 

We start by focusing on objects that transform covariantly under displace- 
ments. For ease of reference we call these intrinsic objects. Unlike the metric 
formulation, the class of intrinsic objects in the gauge treatment extends beyond 
scalars to include general multivectors and functions. For example, each of $\bar{\mathsf{h}}(\nabla)$,
$\omega(a)$ and $\mathcal{R}(B)$ are intrinsic objects. The task is to formulate the field equations
directly in terms of these objects. We assume that the spin is negligible, so that 
the second field equation in (14.1) states that the torsion is zero. The method we 
describe here can therefore be directly applied to problems in general relativity. 

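The torsion equation relates derivatives of $\bar{\mathsf{h}}(a)$ to the $\omega(a)$ field, where $\omega(a)$
is defined in equation (13.245). The torsion equation can be written as


\[ \bar{\mathsf{h}}(\dot{\nabla}) \wedge \dot{\bar{\mathsf{h}}}(c) = -\partial_d \wedge (\omega(d) \cdot \bar{\mathsf{h}}(c)), \tag{14.3} \]
which we contract with $a \wedge b$ to form


\[ \begin{aligned} \langle b \wedge a \, \bar{\mathsf{h}}(\dot{\nabla}) \wedge \dot{\bar{\mathsf{h}}}(c) \rangle &= -\langle b \wedge a \, \partial_d \wedge (\omega(d) \cdot \bar{\mathsf{h}}(c)) \rangle \\ &= (a \cdot \omega(b) - b \cdot \omega(a)) \cdot \bar{\mathsf{h}}(c). \end{aligned} \tag{14.4} \]


The essential operator on the left-hand side is the directional derivative $a \cdot \bar{\mathsf{h}}(\nabla)$.
This turns out to be the key operator in our approach, and we write this as


\[ L_a = a \cdot \bar{\mathsf{h}}(\nabla). \tag{14.5} \]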


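The use of $L_a$ for this operator should not be confused with the quantum-
mechanical angular momentum operators, though their properties are analysed
in a similar way. In terms of $L_a$ the torsion equation becomes


\[ (L_a \dot{\mathsf{h}}(b) - L_b \dot{\mathsf{h}}(a)) \cdot c = (a \cdot \omega(b) - b \cdot \omega(a)) \cdot \bar{\mathsf{h}}(c), \tag{14.6} \]


where, as usual, the overdots determine the scope of a differential operator.

The information contained in the torsion equation is summarised neatly in
terms of the commutator bracket of the $L_a$ operators. We find that the
commutator of $L_a$ and $L_b$ is


\[ \begin{aligned} [L_a, L_b] &= (L_a \mathsf{h}(b) - L_b \mathsf{h}(a)) \cdot \nabla \\ &= (L_a \dot{\mathsf{h}}(b) - L_b \dot{\mathsf{h}}(a)) \cdot \nabla + (L_a b - L_b a) \cdot \bar{\mathsf{h}}(\nabla) \\ &= (a \cdot \omega(b) - b \cdot \omega(a) + L_a b - L_b a) \cdot \bar{\mathsf{h}}(\nabla). \end{aligned} \tag{14.7} \]
We can therefore write
\[ [L_a, L_b] = L_c, \tag{14.8} \]
where
\[ c = a \cdot \omega(b) - b \cdot \omega(a) + L_a b - L_b a = a \cdot \mathcal{D} b - b \cdot \mathcal{D} a. \tag{14.9} \]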


This bracket structure summarises the intrinsic content of the torsion equation 
in a very convenient manner. If spin is present, the right-hand side of equa- 
tion (14.9) is modified in a straightforward way to include spin-dependent terms. 

The key to our strategy is that we delay any explicit solution for $\omega(a)$ until
after further gauge fixing has been performed. Instead, we let $\omega(a)$ take on a
suitably general form, consistent with the form of the h function. This is often
best achieved with the aid of a symbolic algebra package, though it is possible, if
tedious, to perform the calculations by hand. Once a general form for $\omega(a)$ has
been found, the relationship between h(a) and $\omega(a)$ is then encoded intrinsically
in the commutation relations of the $L_a$.

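The next object to form is the Riemann tensor $\mathcal{R}(B)$. This is constructed
in terms of abstract first-order derivatives of the $\omega(a)$ and additional quadratic
terms. We see this by writing


\[ \begin{aligned} \mathcal{R}(a \wedge b) &= \dot{L}_a \dot{\Omega}(\mathsf{h}(b)) - \dot{L}_b \dot{\Omega}(\mathsf{h}(a)) + \omega(a) \times \omega(b) \\ &= L_a \omega(b) - L_b \omega(a) + \omega(a) \times \omega(b) - \Omega(L_a \mathsf{h}(b) - L_b \mathsf{h}(a)), \end{aligned} \tag{14.10} \]


so that we have
\[ \mathcal{R}(a \wedge b) = L_a \omega(b) - L_b \omega(a) + \omega(a) \times \omega(b) - \omega(c), \tag{14.11} \]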


where c is given by equation (14.9). Equation (14.11) enables R(B) to be cal- 
culated entirely in terms of intrinsic quantities. Once the general form of the 
Riemann tensor is found, we can start to employ the rotation-gauge freedom to 
convert R(B) to a suitably simple expression. This gauge fixing is crucial in 
order to arrive at a set of equations that are not underconstrained. The gauge 
fixing is now performed directly at the level of the covariant variables. This gives 
the method great power, as one can motivate gauge choices on sensible physical 
grounds, rather than blind guesswork at the level of the metric. 

With $\mathcal{R}(B)$ suitably fixed, we arrive at a set of relations between first-order
abstract derivatives of the $\omega(a)$, quadratic terms in $\omega(a)$ and matter terms. The
next step is to impose the Bianchi identities, which ensure overall consistency of 
the equations with the bracket structure. Once all this is achieved, one arrives 
at a fully intrinsic set of equations. Solving these equations usually involves 
searching for natural integrating factors. The final step is to make an explicit 
position gauge choice of the h function. The natural way to do this is often to 
ensure that the form of h(a) is such that the integrating factors are expressed 
simply in terms of the chosen coordinates. This description is quite abstract, but 
in the following sections we apply this scheme to a range of physical problems. 
These should illustrate how the scheme is applied in practice. We start with the 
simplest case of spherically-symmetric, torsion-free systems. 


14.2 Spherically-symmetric systems 


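To solve the field equations for spherically-symmetric systems, we first introduce
the standard polar coordinates $(t, r, \theta, \phi)$. In terms of the fixed $\{\gamma_\mu\}$ frame we
write


\[ t = x \cdot \gamma_0, \qquad \cos\theta = \frac{x \cdot \gamma^3}{r}, \qquad r = \sqrt{(x \wedge \gamma_0)^2}, \qquad \tan\phi = \frac{x \cdot \gamma^2}{x \cdot \gamma^1}. \tag{14.12} \]
The associated coordinate frame is


\[ \begin{aligned} e_t &= \gamma_0, \\ e_r &= \sin\theta \, (\cos\phi \, \gamma_1 + \sin\phi \, \gamma_2) + \cos\theta \, \gamma_3, \\ e_\theta &= r \cos\theta \, (\cos\phi \, \gamma_1 + \sin\phi \, \gamma_2) - r \sin\theta \, \gamma_3, \\ e_\phi &= r \sin\theta \, (-\sin\phi \, \gamma_1 + \cos\phi \, \gamma_2), \end{aligned} \tag{14.13} \]
and we will also make use of the unit vectors $\hat{\theta}$ and $\hat{\phi}$ defined by
\[ \hat{\theta} = \frac{1}{r} e_\theta, \qquad \hat{\phi} = \frac{1}{r \sin\theta} e_\phi. \tag{14.14} \]
From these we define the unit bivectors
\[ \sigma_r = e_r e_t, \qquad \sigma_\theta = \hat{\theta} e_t, \qquad \sigma_\phi = \hat{\phi} e_t. \tag{14.15} \]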


For applications in gravity there is little reason to write these spatial bivectors 
in bold face, so we break the convention adopted earlier in this book and leave 
the unit bivectors in ordinary face. We work throughout in natural units
$c = \hbar = G = 1$, so that $\kappa = 8\pi$, and in the first instance we set the cosmological
constant to zero.


14.2.1 The spherical equations 


Our first step towards a solution is to decide a suitable form for the h function 
consistent with spherical symmetry. The form we use is 


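\[ \begin{aligned} \bar{\mathsf{h}}(e^t) &= f_1 e^t + f_2 e^r, \\ \bar{\mathsf{h}}(e^r) &= g_1 e^r + g_2 e^t, \\ \bar{\mathsf{h}}(e^\theta) &= \alpha e^\theta, \\ \bar{\mathsf{h}}(e^\phi) &= \alpha e^\phi, \end{aligned} \tag{14.16} \]


where $f_1$, $f_2$, $g_1$, $g_2$ and $\alpha$ are all functions of $t$ and $r$ only. The only
rotation-gauge freedom in this system is the freedom to perform a boost in the $\sigma_r$
direction. This freedom will be employed later to simplify the equations. Our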
remaining position-gauge freedom lies in the freedom to reparameterise t and r, 
which does not affect the general form of h(a). A natural parameterisation will 
emerge once the physical variables have been identified. 

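To find a general form $\omega(a)$ consistent with the h function of equation (14.16),
we substitute the latter into equation (13.250) and look at the general algebraic
form of $\omega(a)$. Where the coefficients in $\omega(a)$ contain derivatives of terms from
h(a) new symbols are introduced. Undifferentiated terms from h(a) appearing
in $\omega(a)$ arise from frame derivatives and are left in explicitly. The result is that
we can write


\[ \begin{aligned} \omega(e_t) &= G e_r e_t, \\ \omega(e_r) &= F e_r e_t, \\ \omega(\hat{\theta}) &= S \hat{\theta} e_t + (T - \alpha/r) e_r \hat{\theta}, \\ \omega(\hat{\phi}) &= S \hat{\phi} e_t + (T - \alpha/r) e_r \hat{\phi}, \end{aligned} \tag{14.17} \]


where $G$, $F$, $S$ and $T$ are functions of $t$ and $r$ only. The important feature of
these functions is that they transform covariantly under displacements of $r$ and $t$.

\[ \begin{array}{l|ccc} & \sigma_r & \sigma_\theta & \sigma_\phi \\ \hline e_t \cdot \mathcal{D} & 0 & G I\sigma_\phi & -G I\sigma_\theta \\ e_r \cdot \mathcal{D} & 0 & F I\sigma_\phi & -F I\sigma_\theta \\ \hat{\theta} \cdot \mathcal{D} & T\sigma_\theta - S I\sigma_\phi & -T\sigma_r & S I\sigma_r \\ \hat{\phi} \cdot \mathcal{D} & T\sigma_\phi + S I\sigma_\theta & -S I\sigma_r & -T\sigma_r \end{array} \]

Table 14.1 Covariant derivatives of the polar-frame unit timelike bivectors.
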
To define a suitable bracket structure we first introduce the operators 


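\[ \begin{aligned} L_t &= e_t \cdot \bar{\mathsf{h}}(\nabla), & L_\theta &= \hat{\theta} \cdot \bar{\mathsf{h}}(\nabla), \\ L_r &= e_r \cdot \bar{\mathsf{h}}(\nabla), & L_\phi &= \hat{\phi} \cdot \bar{\mathsf{h}}(\nabla). \end{aligned} \tag{14.18} \]
Equation (14.8), together with our form for $\omega(a)$, yields the relations
\[ \begin{aligned} [L_t, L_r] &= G L_t - F L_r, & [L_r, L_\theta] &= -T L_\theta, \\ [L_t, L_\theta] &= -S L_\theta, & [L_r, L_\phi] &= -T L_\phi, \\ [L_t, L_\phi] &= -S L_\phi, & [L_\theta, L_\phi] &= 0. \end{aligned} \tag{14.19} \]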


A set of bracket relations such as these is the first step in writing the field equa- 
tions in an entirely intrinsic form. The use of orthonormal vectors in expressing 
these relations brings out the structure most clearly. 
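
One of the bracket relations, $[L_r, L_\theta] = -T L_\theta$, can be checked numerically in the simplest possible setting. The sketch below assumes flat space with the h function equal to the identity, so that $\omega(a) = 0$, $\alpha = 1$ and hence $T = 1/r$, with $L_r = \partial_r$ and $L_\theta = (1/r)\partial_\theta$. The test function `f` and the finite-difference helper are illustrative choices, not part of the text.

```python
import math


def partial(fun, x, i, h=1e-4):
    """Central finite-difference derivative of fun with respect to x[i]."""
    xp, xm = list(x), list(x)
    xp[i] += h
    xm[i] -= h
    return (fun(xp) - fun(xm)) / (2 * h)


def L_r(fun):
    """Flat-space radial operator: d/dr acting on a function of (r, theta)."""
    return lambda x: partial(fun, x, 0)


def L_theta(fun):
    """Flat-space angular operator: (1/r) d/dtheta."""
    return lambda x: partial(fun, x, 1) / x[0]


def f(x):
    r, theta = x
    return r ** 3 * math.sin(2 * theta)  # arbitrary smooth test function


x0 = [1.7, 0.6]
commutator = L_r(L_theta(f))(x0) - L_theta(L_r(f))(x0)
expected = -(1.0 / x0[0]) * L_theta(f)(x0)  # -T L_theta f with T = 1/r
```

For this test function both sides equal $-2r\cos 2\theta$ analytically; the numerical values agree to within the accuracy of the nested finite differences.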

Next we seek an intrinsic form of the Riemann tensor. This calculation is 
simplified by making use of the results in table 14.1. The bracket relations 
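enable us to calculate the derivatives of $\alpha/r$ by writing


\[ \begin{aligned} L_t(\alpha/r) &= L_t L_\theta \theta = [L_t, L_\theta] \theta = -S\alpha/r, \\ L_r(\alpha/r) &= L_r L_\theta \theta = [L_r, L_\theta] \theta = -T\alpha/r. \end{aligned} \tag{14.20} \]


Application of equation (14.11) is now straightforward, and leads to the Riemann
tensor
\[ \begin{aligned} \mathcal{R}(\sigma_r) &= (L_r G - L_t F + G^2 - F^2) \sigma_r, \\ \mathcal{R}(\sigma_\theta) &= (-L_t S + GT - S^2) \sigma_\theta + (L_t T + ST - SG) I\sigma_\phi, \\ \mathcal{R}(\sigma_\phi) &= (-L_t S + GT - S^2) \sigma_\phi - (L_t T + ST - SG) I\sigma_\theta, \\ \mathcal{R}(I\sigma_\phi) &= (L_r T + T^2 - FS) I\sigma_\phi - (L_r S + ST - FT) \sigma_\theta, \\ \mathcal{R}(I\sigma_\theta) &= (L_r T + T^2 - FS) I\sigma_\theta + (L_r S + ST - FT) \sigma_\phi, \\ \mathcal{R}(I\sigma_r) &= (-S^2 + T^2 - (\alpha/r)^2) I\sigma_r. \end{aligned} \tag{14.21} \]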


We must next decide on the form of matter energy-momentum tensor that the 
gravitational fields couple to. We assume that the matter is modelled by an ideal 
fluid, as discussed in section 13.5.4, so we can write 


\[ \mathcal{T}(a) = (\rho + p) \, a \cdot v \, v - p a, \tag{14.22} \]


where $\rho$ is the energy density, $p$ is the pressure and $v$ is the covariant fluid velocity
($v^2 = 1$). Radial symmetry means that $v$ can only lie in the $e_t$ and $e_r$ directions,
so $v$ must take the form


\[ v = \cosh(\chi) \, e_t + \sinh(\chi) \, e_r. \tag{14.23} \]


But, in restricting the h function to the form of equation (14.16), we retained
the gauge freedom to perform arbitrary radial boosts. This freedom can now be
employed to set $v = e_t$, so that the matter energy-momentum tensor becomes


\[ \mathcal{T}(a) = (\rho + p) \, a \cdot e_t \, e_t - p a. \tag{14.24} \]


There is no physical content in the choice $v = e_t$ as all physical relations must
be independent of gauge choices. The choice simply fixes the rotation gauge in 
such a way that the energy-momentum tensor takes on the simplest form. This 
removes all rotation-gauge freedom — an essential step in the solution method, 
since all non-physical degrees of freedom must be removed before one can achieve 
a complete set of physical equations. 

In section 13.6.1 we saw how to decompose the Riemann tensor into a source 
term and the Weyl tensor. The source term can be written 


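\[ \mathcal{R}(a \wedge b) - \mathcal{W}(a \wedge b) = \frac{4\pi}{3} \bigl( 3 a \wedge \mathcal{T}(b) + 3 \mathcal{T}(a) \wedge b - 2 \mathcal{T} \, a \wedge b \bigr), \tag{14.25} \]


where $\mathcal{T} = \partial_a \cdot \mathcal{T}(a)$ is the trace of the matter energy-momentum tensor. With
$\mathcal{T}(a)$ given by equation (14.24), $\mathcal{R}(B)$ is restricted to the form


\[ \mathcal{R}(B) = \mathcal{W}(B) + \frac{4\pi}{3} \bigl( 3(\rho + p) B \cdot e_t \, e_t - 2\rho B \bigr). \tag{14.26} \]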


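Comparing this with equation (14.21) we see that the Weyl tensor must have
the general form


\[ \begin{aligned} \mathcal{W}(\sigma_r) &= \alpha_1 \sigma_r, & \mathcal{W}(I\sigma_r) &= \alpha_4 I\sigma_r, \\ \mathcal{W}(\sigma_\theta) &= \alpha_2 \sigma_\theta + \beta_1 I\sigma_\phi, & \mathcal{W}(I\sigma_\theta) &= \alpha_3 I\sigma_\theta + \beta_2 \sigma_\phi, \\ \mathcal{W}(\sigma_\phi) &= \alpha_2 \sigma_\phi - \beta_1 I\sigma_\theta, & \mathcal{W}(I\sigma_\phi) &= \alpha_3 I\sigma_\phi - \beta_2 \sigma_\theta. \end{aligned} \tag{14.27} \]


Here each of the $\alpha_i$ and $\beta_i$ represents a combination of intrinsic objects.

The torsionless gravitational field equations ensure that the Weyl tensor is
self-dual and symmetric. The former implies that $\alpha_1 = \alpha_4$, $\alpha_2 = \alpha_3$ and $\beta_1 = -\beta_2$,
and the latter implies that $\beta_1 = \beta_2$. It follows that $\beta_1 = \beta_2 = 0$. Finally, $\mathcal{W}(B)$
must be traceless, which requires that $\alpha_1 + 2\alpha_2 = 0$. Taken together, these
conditions reduce $\mathcal{W}(B)$ to the form


\[ \mathcal{W}(B) = \frac{\alpha_1}{4} (B + 3 \sigma_r B \sigma_r). \tag{14.28} \]
This is of Petrov type D. From the form of $\mathcal{R}(I\sigma_r)$ we can see that


\[ \alpha_1 = \frac{8\pi}{3} \rho - S^2 + T^2 - \frac{\alpha^2}{r^2}. \tag{14.29} \]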
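
The algebraic properties claimed for equation (14.28) can be checked with a small matrix model. This is a sketch under stated assumptions: it uses the standard isomorphism between the even subalgebra of the spacetime algebra and the Pauli algebra, representing the unit bivectors $\sigma_k$ by the 2x2 Pauli matrices and the pseudoscalar $I$ by the imaginary unit, and it arbitrarily identifies $\sigma_r$ with the third Pauli matrix. All function names are illustrative.

```python
def mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(A, B):
    return tuple(tuple(A[i][j] + B[i][j] for j in range(2)) for i in range(2))


def scale(c, A):
    return tuple(tuple(c * A[i][j] for j in range(2)) for i in range(2))


S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))
SIGMA_R = S3  # assumption: the radial unit bivector is represented by S3


def weyl(B, alpha1=1.0):
    """W(B) = (alpha1 / 4) (B + 3 sigma_r B sigma_r), as in (14.28)."""
    return scale(alpha1 / 4, add(B, scale(3, mul(mul(SIGMA_R, B), SIGMA_R))))


def component(B, s):
    """Complex coefficient of B along the Pauli matrix s, i.e. trace(B s) / 2."""
    P = mul(B, s)
    return 0.5 * (P[0][0] + P[1][1])


eigenvalues = [component(weyl(s), s) for s in (S3, S1, S2)]
off_diagonal = component(weyl(S1), S2)
```

In this model the function is diagonal with eigenvalues $\alpha_1$, $-\alpha_1/2$ and $-\alpha_1/2$, so it is traceless, and self-duality is automatic because the map is complex-linear. The repeated eigenvalue is the signature of Petrov type D.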


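If we now define $\beta$ by
\[ 4\beta = -S^2 + T^2 - \frac{\alpha^2}{r^2}, \tag{14.30} \]


then the full Riemann tensor can be written as
\[ \mathcal{R}(B) = \Bigl( \beta + \frac{2\pi}{3} \rho \Bigr) (B + 3 \sigma_r B \sigma_r) + \frac{4\pi}{3} \bigl( 3(\rho + p) B \cdot e_t \, e_t - 2\rho B \bigr). \tag{14.31} \]
We compare this with equation (14.21) to obtain the following set of equations:


\[ \begin{aligned} L_t S &= 2\beta + GT - S^2 - 4\pi p, \\ L_t T &= S(G - T), \\ L_r S &= T(F - S), \\ L_r T &= -2\beta + FS - T^2 - 4\pi\rho, \\ L_r G - L_t F &= F^2 - G^2 + 4\beta + 4\pi(\rho + p). \end{aligned} \tag{14.32} \]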
We are now close to our goal of a complete set of intrinsic equations. The
remaining step is to enforce the Bianchi identities. The only identity that contains
new information in our present setup is the contracted Bianchi identity defined in
section 13.6.2, which guarantees covariant conservation of the energy-momentum
tensor. For an ideal fluid this results in the pair of equations


\[ \begin{aligned} \mathcal{D} \cdot (\rho v) + p \, \mathcal{D} \cdot v &= 0, \\ (\rho + p)(v \cdot \mathcal{D} v) \wedge v - (\mathcal{D} p) \wedge v &= 0. \end{aligned} \tag{14.33} \]
The quantity $(v \cdot \mathcal{D} v) \wedge v$ is the covariant acceleration bivector, so the second
equation relates the acceleration to the pressure gradient. For the case of
spherically-symmetric fields, these equations reduce to

\[ \begin{aligned} L_t \rho &= -(F + 2S)(\rho + p), \\ L_r p &= -G(\rho + p). \end{aligned} \tag{14.34} \]
The latter of these identifies $G$ as the radial acceleration. The full Bianchi
identities now turn out to be satisfied as a consequence of the contracted identities
and the bracket relation


\[ [L_t, L_r] = G L_t - F L_r. \tag{14.35} \]


This completes our derivation of the intrinsic equations. The full set is defined 
by equations (14.20), (14.32), the contracted identities (14.34) and the bracket 
structure of equation (14.35). The equation structure is closed, as the bracket 
relation (14.35) is consistent with the known derivatives. The derivation of such 
a set of equations is the basic aim of our method. The equations deal solely 
with objects that transform covariantly under displacements, and many of these 
quantities have direct physical significance. 


14.2.2 Solving the spherical equations 


To solve the intrinsic equation structure we first form the derivatives of δ to
obtain

\[
\begin{aligned}
  L_t \delta + 3S\delta &= 2\pi S p, \\
  L_r \delta + 3T\delta &= -2\pi T \rho.
\end{aligned} \tag{14.36}
\]


These results suggest that we should look for an integrating factor for the L_t + S
and L_r + T operators. Such a function, X say, should have the properties that

\[
  L_t X = SX, \qquad L_r X = TX. \tag{14.37}
\]


A function with these properties can exist only if the derivatives are consistent
with the bracket relation of equation (14.35). This is checked by forming

\[
\begin{aligned}
  [L_t, L_r] X &= L_t(TX) - L_r(SX) \\
               &= X(L_t T - L_r S) \\
               &= X(SG - FT) \\
               &= G L_t X - F L_r X,
\end{aligned} \tag{14.38}
\]


which confirms that the properties of X are consistent with the bracket structure.
In fact, we can see from equation (14.20) that r/α has the desired properties.
Integrating factors of this type often arise as natural, intrinsically-defined
coordinates, and the form of the solution is usually simplest when expressed directly
in terms of these. Since the position-gauge freedom in the r direction has not yet
been fixed, it is natural to set α = 1, so that r plays the role of the integrating
factor directly. We will confirm shortly that this gauge choice ensures that r is
a physically meaningful quantity.

With the radial scale fixed by setting α = 1, we can now make some further
simplifications. From the form of the h function in equation (14.16), together
with equation (14.37), we see that

\[
  g_1 = L_r r = Tr, \qquad g_2 = L_t r = Sr. \tag{14.39}
\]


This replaces two functions in the bivector connection in favour of terms in h(a).
We also define

\[
  M = -2r^3\delta = \frac{r}{2}\left(g_2^2 - g_1^2 + 1\right), \tag{14.40}
\]

which satisfies

\[
  L_t M = -4\pi r^2 g_2\, p, \qquad L_r M = 4\pi r^2 g_1\, \rho. \tag{14.41}
\]

The latter suggests that M plays the role of an intrinsic mass.

So far we have defined the natural distance scale, but have not yet found a
natural time coordinate. Such a coordinate is required to complete the solution,
so we now look for additional criteria to motivate this choice. We are currently
free to perform an arbitrary r and t-dependent displacement along the e_t
direction. This gives us complete freedom in the choice of the f2 function. If we now
invert equation (14.41) to find the coordinate derivatives of M we obtain

\[
\begin{aligned}
  \frac{\partial M}{\partial t} &= \frac{-4\pi g_1 g_2 r^2(\rho + p)}{f_1 g_1 - f_2 g_2}, \\
  \frac{\partial M}{\partial r} &= \frac{4\pi r^2(f_1 g_1 \rho + f_2 g_2 p)}{f_1 g_1 - f_2 g_2}.
\end{aligned} \tag{14.42}
\]


The second equation reduces to a simple classical relation if we choose f2 = 0,
as we then obtain

\[
  \partial_r M = 4\pi r^2 \rho. \tag{14.43}
\]

This says that, at constant time t, M(r, t) is determined by the amount of
mass-energy in a sphere of radius r.


With f2 set to zero we can now use the bracket structure to solve for f1. We
have

\[
  L_t = f_1\partial_t + g_2\partial_r, \qquad L_r = g_1\partial_r, \tag{14.44}
\]

so the bracket relation of equation (14.35) implies that

\[
  L_r f_1 = -G f_1. \tag{14.45}
\]

It follows that

\[
  f_1 = \epsilon(t)\exp\left(-\int^r \frac{G}{g_1}\,ds\right). \tag{14.46}
\]

The function ε(t) can be absorbed by a further t-dependent rescaling along e_t,
which will not reintroduce a term in f2. In the f2 = 0 gauge we can therefore
reduce to a system in which

\[
  f_1 = \exp\left(-\int^r \frac{G(s)}{g_1(s)}\,ds\right). \tag{14.47}
\]

The physical explanation for why the f2 = 0 gauge is a very natural one to
work in emerges when we set the pressure to zero. In this case equation (14.34)
forces G to be zero, and equation (14.47) then sets f1 = 1. A (free-falling)
particle comoving with the fluid has covariant velocity v = e_t, so the trajectory
of this particle is defined by

\[
  \dot{t}\,e_t + \dot{r}\,e_r = \underline{h}(e_t) = e_t + g_2 e_r, \tag{14.48}
\]

where the dots denote differentiation with respect to the proper time. Since
ṫ = 1 the time coordinate t matches the proper time of all observers comoving
with the fluid. In this sense, the time coordinate that has emerged behaves like a 
global Newtonian time on which all observers can agree (provided all clocks are 
correlated initially). By employing the various gauge choices outlined above, and 
casting the dynamics in terms of the t coordinate, we are ensuring that (when 
p = 0) the physics is formulated from the viewpoint of freely-falling observers. 
We then expect that the gravitational equations should take on a clear, physical 
form, which is indeed the case. 
As a further illustration of this point, it is clear from (14.48) that g2 represents
a radial velocity for the particle. In the absence of pressure, the rate of change
of mass is given by

\[
  \partial_t M = -4\pi r^2 g_2\, \rho. \tag{14.49}
\]

This equation equates the work with the rate of flow of energy density. Similarly,
equation (14.40), written in the form

\[
  \frac{g_2^2}{2} - \frac{M}{r} = \frac{1}{2}\left(g_1^2 - 1\right), \tag{14.50}
\]

is also now familiar from Newtonian physics: it is a Bernoulli equation for
zero pressure and total (non-relativistic) energy (g1² − 1)/2. When pressure is
included, the purely Newtonian interpretation starts to break down, due mainly
to the fact that pressure can act as a source of gravitation. But it remains the
case that the gauge choices described here pick out what appears to be the most
natural set of equations for studying spherically-symmetric systems.

  The h field               h̄(e^t) = f1 e^t
                            h̄(e^r) = g1 e^r + g2 e^t
                            h̄(e^θ) = e^θ
                            h̄(e^φ) = e^φ

  The ω field               ω(e_t) = G e_r e_t
                            ω(e_r) = F e_r e_t
                            ω(θ̂) = (g2/r) θ̂ e_t + ((g1 − 1)/r) e_r θ̂
                            ω(φ̂) = (g2/r) φ̂ e_t + ((g1 − 1)/r) e_r φ̂

  Directional derivatives   L_t = f1 ∂_t + g2 ∂_r
                            L_r = g1 ∂_r

  Equations for G and F     L_t g1 = G g2
                            L_r g2 = F g1
                            f1 = exp{∫^r −G/g1 ds}

  Definition of M           M = ½ r (g2² − g1² + 1)

  Remaining derivatives     L_t g2 = G g1 − M/r² − 4πr p
                            L_r g1 = F g2 + M/r² − 4πr ρ

  Matter derivatives        L_t M = −4πr² g2 p
                            L_r M = 4πr² g1 ρ
                            L_t ρ = −(2g2/r + F)(ρ + p)
                            L_r p = −G(ρ + p)

  Riemann tensor            R(B) = 4π((ρ + p) B·e_t e_t − (2ρ/3) B)
                                   − ½ (M/r³ − 4πρ/3)(B + 3σ_r B σ_r)

  Energy-momentum tensor    T(a) = (ρ + p) a·e_t e_t − p a

Table 14.2 Gravitational equations governing a radially-symmetric perfect
fluid. An equation of state and initial data ρ(r, t0) and g2(r, t0) determine
the future evolution of the system.


The system of equations we have now derived is summarised in table 14.2. We
refer to this system as defining the Newtonian gauge, since so many equations
take on an almost Newtonian form. Of course, this should not distract from 
the fact that we have solved the full, relativistic gravitational field equations. 
The system of equations in table 14.2 underlies a wide range of phenomena
in relativistic astrophysics and cosmology. One aspect of these equations is
immediately apparent. Given an equation of state p = p(ρ), and initial data in
the form of the density ρ(r, t0) and the velocity g2(r, t0), the future evolution
of the system is fully determined. This is because ρ determines p and M on a
time slice, and the definition of M determines g1. The equations for L_r p, L_r g1
and L_r g2 then determine the remaining information on the time slice. Finally,
the L_t M and L_t g2 equations can be used to update the information to the next
time slice, and the process can start again. The equations can therefore be
implemented numerically as a simple set of first-order update equations. This is
important for a wide range of applications.


14.2.3 Static matter distributions


As a first application of the equations governing a spherically-symmetric system,
we consider a static matter distribution. This solution is appropriate for a
non-rotating spherical source. The density and pressure are now functions of r only.
The mass is given by

\[
  M(r) = \int_0^r 4\pi s^2 \rho(s)\,ds \tag{14.51}
\]

and it follows that

\[
  L_t M = 4\pi r^2 g_2\, \rho = -4\pi r^2 g_2\, p. \tag{14.52}
\]


For any physical matter distribution ρ and p must both be positive, in which
case equation (14.52) can only be satisfied if g2 vanishes. It follows that F = 0
as well, so for static, extended objects we have

\[
  g_2 = F = 0. \tag{14.53}
\]

Since g2 is zero, g1 is given simply in terms of M(r) by

\[
  g_1^2 = 1 - \frac{2M(r)}{r}. \tag{14.54}
\]


For this to hold we require that 2M(r) < r. This condition says that a horizon
has not formed anywhere in the object.
The remaining equation of use is that for L_t g2, which now gives

\[
  G g_1 = \frac{M(r)}{r^2} + 4\pi r p. \tag{14.55}
\]

Equations (14.54) and (14.55) combine with that for L_r p to produce the
Oppenheimer–Volkov equation

\[
  \frac{\partial p}{\partial r} = -\frac{(\rho + p)\left(M(r) + 4\pi r^3 p\right)}{r\left(r - 2M(r)\right)}. \tag{14.56}
\]

This is the force balance equation appropriate for a relativistic matter
distribution. The line element generated by our solution is

\[
  ds^2 = \frac{1}{f_1^2}\,dt^2 - \frac{r}{r - 2M(r)}\,dr^2
         - r^2\,d\theta^2 - r^2\sin^2(\theta)\,d\phi^2, \tag{14.57}
\]

where f1 is given by equation (14.47). The solution extends straightforwardly to
the region outside the star. In this region M is constant, and

\[
  f_1 = 1/g_1 = (1 - 2M/r)^{-1/2}. \tag{14.58}
\]


We therefore recover the Schwarzschild line element. This is the solution used 
for some of the most famous tests of general relativity, including those for the 
bending of light and the perihelion precession of Mercury. Clearly, the gauge 
theory framework does not alter any of these results. 
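
As a rough numerical illustration of equation (14.56), the sketch below integrates the Oppenheimer–Volkov equation outwards for a uniform-density star and compares the result with the standard closed-form interior pressure for that case. The uniform-density model, the values of `R` and `M_tot`, the fourth-order Runge-Kutta stepper and all function names are assumptions made for this example; they are not taken from the text. Units are G = c = 1.

```python
import math

def ov_rhs(r, p, rho, mass):
    """Right-hand side of equation (14.56)."""
    m = mass(r)
    return -(rho(r) + p) * (m + 4.0 * math.pi * r**3 * p) / (r * (r - 2.0 * m))

def integrate_pressure(p_start, rho, mass, r_start, r_end, steps=4000):
    """Integrate dp/dr from r_start to r_end with classical RK4."""
    h = (r_end - r_start) / steps
    r, p = r_start, p_start
    for _ in range(steps):
        k1 = ov_rhs(r, p, rho, mass)
        k2 = ov_rhs(r + 0.5 * h, p + 0.5 * h * k1, rho, mass)
        k3 = ov_rhs(r + 0.5 * h, p + 0.5 * h * k2, rho, mass)
        k4 = ov_rhs(r + h, p + h * k3, rho, mass)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += h
    return p

# Assumed test case: uniform density, radius R, total mass M_tot (2*M_tot < R).
R, M_tot = 10.0, 1.0
rho0 = 3.0 * M_tot / (4.0 * math.pi * R**3)

def rho(r):
    return rho0

def mass(r):
    """Equation (14.51) evaluated for constant density."""
    return 4.0 * math.pi * rho0 * r**3 / 3.0

def p_exact(r):
    """Standard closed-form interior pressure for a uniform-density star."""
    a = math.sqrt(1.0 - 2.0 * M_tot * r**2 / R**3)
    b = math.sqrt(1.0 - 2.0 * M_tot / R)
    return rho0 * (a - b) / (3.0 * b - a)

r0 = 1e-5 * R  # start slightly off-centre to avoid the 0/0 form at r = 0
p_mid = integrate_pressure(p_exact(r0), rho, mass, r0, 0.5 * R)
p_surface = integrate_pressure(p_exact(r0), rho, mass, r0, R)
print(p_mid, p_exact(0.5 * R), p_surface)
```

The integrated pressure should track the closed form and fall to (numerically) zero at r = R, which is where the interior solution joins the exterior region of equation (14.58).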


14.3 Schwarzschild black holes 


Perhaps the most famous solution of the Einstein equations (apart from Lorent- 
zian spacetime) is the Schwarzschild solution for a black hole. This solution 
describes the gravitational fields surrounding a point source of matter, of total 
gravitational mass M. One form of this solution is described by the line element 
of equation (14.57) for the case of constant M. But this is ill defined at r = 2M 
which, as we shall soon discover, defines an event horizon. This tells us that our 
gauge choice has not yielded a satisfactory global solution, so we must return to 
the field equations to discover what went wrong. 

For a point source located at the origin we have ρ = p = 0 everywhere away
from the source. The matter equations therefore reduce to

\[
  L_t M = L_r M = 0, \tag{14.59}
\]

which tells us that the mass M is constant. The remaining equations simplify to

\[
\begin{aligned}
  L_t g_1 &= G g_2, \\
  L_r g_2 &= F g_1, \\
  g_1^2 - g_2^2 &= 1 - 2M/r.
\end{aligned} \tag{14.60}
\]

No further equations yield new information, so we have an underdetermined
system of equations. Despite all of the gauge-fixing steps taken to arrive at the
set of equations summarised in table 14.2, for vacuum fields some additional
gauge fixing is still required. The reason for this is that, in the vacuum region,
the Riemann tensor reduces to

\[
  R(B) = -\frac{M}{2r^3}\,(B + 3\sigma_r B \sigma_r). \tag{14.61}
\]

This tensor is now invariant under boosts in the σ_r plane, whereas previously the
presence of the fluid velocity in the Riemann tensor broke this symmetry.
The appearance of this new symmetry in the matter-free case manifests itself as
a new freedom in the choice of the h function.

Given this new freedom, we can look for a choice of g1 and g2 which simplifies
the equations. If we attempt to reproduce the Schwarzschild line element we have
to set g2 = 0, but then we immediately run into difficulties with g1, which is not
defined for r < 2M. We must therefore look for an alternative gauge choice. A
suitable candidate, motivated by the pressure-free equations, is provided by the
simple choice

\[
  g_1 = 1. \tag{14.62}
\]

It follows that

\[
  f_1 = 1, \qquad g_2 = -\sqrt{2M/r} \tag{14.63}
\]

and

\[
  G = 0, \qquad F = -\frac{M}{g_2 r^2} = \left(\frac{M}{2r^3}\right)^{1/2}. \tag{14.64}
\]


In this gauge the h function has the remarkably simple form

\[
  \bar{h}(a) = a - \sqrt{2M/r}\;a\cdot e_r\,e_t. \tag{14.65}
\]

This only differs from the identity through a single term. The line element
obtained from this gauge choice is

\[
  ds^2 = dt^2 - \left(dr + \left(\frac{2M}{r}\right)^{1/2} dt\right)^2
         - r^2\left(d\theta^2 + \sin^2(\theta)\,d\phi^2\right), \tag{14.66}
\]


which is regular at the horizon (r = 2M) and covers all spacetime down to 
r = 0. This form of the line element was first derived by Painlevé and Gullstrand, 
not long after Schwarzschild’s original work was published. Despite the many 
advantages of this form of the solution, it has not been routinely employed in 
solving physical problems. 
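
The gauge choice of equations (14.62) to (14.64) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the mass value, the finite-difference step and the function names are chosen for the example. It checks the constraint g1² − g2² = 1 − 2M/r, the relation L_r g2 = F g1, the vacuum form of the L_t g2 equation from table 14.2 (with G = 0), and that the (t, r) block of the line element (14.66) stays finite at r = 2M.

```python
import math

M = 1.0  # illustrative mass, units with G = c = 1

def g2(r):
    """Equation (14.63); g1 = f1 = 1 in this gauge."""
    return -math.sqrt(2.0 * M / r)

def F_closed(r):
    """Closed form of equation (14.64)."""
    return math.sqrt(M / (2.0 * r**3))

def d_dr(func, r, eps=1e-5):
    """Central finite difference standing in for the radial derivative."""
    return (func(r + eps) - func(r - eps)) / (2.0 * eps)

def residuals(r):
    """Residuals of the vacuum equations; all should be close to zero."""
    g1 = 1.0
    constraint = g1**2 - g2(r)**2 - (1.0 - 2.0 * M / r)
    # L_r g2 = F g1, with L_r = g1 d/dr
    radial = g1 * d_dr(g2, r) - F_closed(r) * g1
    # L_t g2 = G g1 - M/r^2 with G = 0, and L_t = d/dt + g2 d/dr on static fields
    evolution = g2(r) * d_dr(g2, r) + M / r**2
    return constraint, radial, evolution

def metric_tr_block(r):
    """(g_tt, g_tr, g_rr) obtained by expanding the square in (14.66)."""
    s = math.sqrt(2.0 * M / r)
    return 1.0 - s * s, -s, -1.0

for radius in (0.5, 2.0, 10.0):
    print(radius, residuals(radius), metric_tr_block(radius))
```

At r = 2M the block reads (0, −1, −1): the dt² coefficient vanishes but no component diverges, in line with the regularity claimed for (14.66).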

The h field of equation (14.65) is the form of the Schwarzschild solution we will 
use for studying the properties of spherically-symmetric black holes. Of course, 
all physical predictions must be independent of gauge, but this only reinforces 
the point that we should always endeavour to work in a gauge that simplifies the 
analysis as far as possible. The results for the extension of the action of h to an
arbitrary multivector A are useful in what follows. We find that

\[
  \bar{h}(A) = A - \sqrt{2M/r}\,(A\cdot e_r)\wedge e_t. \tag{14.67}
\]

It follows that det(h) = 1 and the inverse of the adjoint function, as defined by
equation (4.152), is given by

\[
  \underline{h}^{-1}(A) = A + \sqrt{2M/r}\,(A\cdot e_t)\wedge e_r. \tag{14.68}
\]


It is straightforward to verify that this function recovers the line element of
equation (14.66).


14.3.1 Point particle trajectories


The motion of a classical point particle in free fall is governed by the geodesic
equation

\[
  v\cdot Dv = \dot{v} + \omega(v)\cdot v = 0. \tag{14.69}
\]

The mass m of the particle is unimportant (provided m ≪ M), and is set to
unity throughout this section. Since G = 0 in our chosen gauge, we immediately
see that

\[
  \omega(e_t) = 0. \tag{14.70}
\]

It follows that v = e_t is a solution of the geodesic equation. The trajectory this
defines has

\[
  \dot{x} = \underline{h}(v) = \underline{h}(e_t) = e_t + u\,e_r, \tag{14.71}
\]

where

\[
  u = \dot{r} = -\sqrt{2M/r}. \tag{14.72}
\]

Particles, or observers, following the geodesic defined by v = e_t fall in radially
with velocity ṙ given by the familiar Newtonian formula. Furthermore, we see
that ṫ = 1, so the time coordinate t is precisely the time measured by these
infalling observers. This is, in part, why the gauge choice we have adopted turns 
out to simplify many calculations. 
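
Because ṫ = 1 along these geodesics, equation (14.72) can be integrated directly to give the time an infalling observer measures between two radii. The following sketch, with assumed starting radius, mass and Simpson-rule resolution, compares a numerical integration of dt = −dr/√(2M/r) with the elementary closed form (2/3)(r_start^{3/2} − r_end^{3/2})/√(2M). It is an illustration only; the function names are hypothetical.

```python
import math

def infall_time(r_start, r_end, M, n=2000):
    """Simpson-rule integral of dt = -dr / sqrt(2M/r) from r_start down to r_end.

    n must be even.
    """
    def integrand(r):
        return math.sqrt(r / (2.0 * M))
    h = (r_start - r_end) / n
    total = integrand(r_end) + integrand(r_start)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(r_end + i * h)
    return total * h / 3.0

def infall_time_closed(r_start, r_end, M):
    """Closed form of the same integral."""
    return (2.0 / 3.0) * (r_start**1.5 - r_end**1.5) / math.sqrt(2.0 * M)

M = 1.0
to_horizon = infall_time(10.0, 2.0 * M, M)
to_centre = infall_time(10.0, 0.0, M)
print(to_horizon, infall_time_closed(10.0, 2.0 * M, M))
print(to_centre, infall_time_closed(10.0, 0.0, M))
```

Both times are finite: in this gauge nothing special happens to the integral at r = 2M, and the observer reaches r = 0 a finite time after crossing the horizon.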

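Now consider a more general trajectory, with covariant velocity

\[
  v = \dot{t}\,e_t + \left(\dot{t}\sqrt{2M/r} + \dot{r}\right)e_r
      + \dot{\theta}\,e_\theta + \dot{\phi}\,e_\phi. \tag{14.73}
\]

Since the h function is independent of t we have, from equation (13.272),

\[
  \underline{h}^{-1}(e_t)\cdot v = (1 - 2M/r)\,\dot{t} - \dot{r}\sqrt{2M/r} = \text{constant}. \tag{14.74}
\]

So, for particles moving forwards in time (ṫ > 0 for r → ∞), we can write

\[
  (1 - 2M/r)\,\dot{t} = \alpha + \dot{r}\sqrt{2M/r}, \tag{14.75}
\]

where the constant α satisfies α > 0. The radial equation is found from the
constraint that v² = 1, which gives

\[
  \dot{r}^2 = \alpha^2 - (1 - 2M/r)\left(1 + r^2\left(\dot{\theta}^2 + \sin^2(\theta)\,\dot{\phi}^2\right)\right). \tag{14.76}
\]

Spherical symmetry implies that the angular momentum (per unit mass) J is also
conserved, where

\[
  J^2 = r^4\left(\dot{\theta}^2 + \sin^2(\theta)\,\dot{\phi}^2\right). \tag{14.77}
\]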

The motion of a particle around a black hole is therefore determined by the
single radial equation

\[
  \dot{r}^2 = \alpha^2 - (1 - 2M/r)\left(1 + \frac{J^2}{r^2}\right). \tag{14.78}
\]

This equation is gauge-invariant, as it relates local quantities. The radial
coordinate r is defined locally by the magnitude of the Riemann tensor, and the dots
denote the derivative with respect to (local) proper time. This transition from
global to local variables is in keeping with the gauging process. The motion of
a particle in spacetime is obtained by integrating equations (14.78) and (14.75).
At the horizon we have ṙ = −α, so there is no pole in equation (14.75), and the
equations can be integrated down to the singularity.
Differentiating equation (14.78) we obtain

\[
  \ddot{r} = -\frac{M}{r^2} + \frac{J^2}{r^3} - \frac{3MJ^2}{r^4}. \tag{14.79}
\]

The equivalent three-dimensional vector equation is

\[
  \ddot{\boldsymbol{x}} = -\left(\frac{M}{r^2} + \frac{3MJ^2}{r^4}\right)\hat{\boldsymbol{r}}. \tag{14.80}
\]

This equation was analysed perturbatively in section 3.3.1. For stable orbits the
main new effect introduced by relativity is a small perturbation of the eccentricity
vector. The content of equation (14.78) can similarly be summarised in the radial
effective potential (per unit mass)

\[
  V_{\text{eff}} = -\frac{M}{r} + \frac{J^2}{2r^2}\left(1 - \frac{2M}{r}\right). \tag{14.81}
\]

We then have

\[
  \frac{\alpha^2 - 1}{2} = \frac{\dot{r}^2}{2} + V_{\text{eff}}, \tag{14.82}
\]

which identifies mα as the conserved relativistic energy of the particle. Bound
states have α < 1 and scattering states have α > 1.

The effective potential differs from the Newtonian expression in the factor
of (1 − 2M/r) multiplying the centrifugal term. This has little effect at large
distances, but dramatically alters the small-r behaviour. Inside r = 2M the
centrifugal term in the effective potential changes sign and becomes attractive.
There is no longer any term in the potential applying an effective outward force,
and the particle must inexorably move towards the central singularity. One can
see this clearly in equation (14.73). Inside the horizon the velocity ṙ must be
negative in order for v² = 1 to remain satisfied. Once inside the horizon, no
particle can escape the singularity, no matter what force is applied to attempt
to counteract the gravitational pull. Eventually, the tidal forces (defined by the
Riemann tensor) become so large that all objects are pulled apart into their
constituent particles.


14.3.2 Photon trajectories


A full treatment of the properties of electromagnetic waves in a gravitational 
background involves solving the gravitationally-coupled Maxwell equations of 
section 13.5.8. For a range of practical problems it is sufficient to ignore the 
detailed properties of the electromagnetic field, and work in the geometric optics 
limit. In this approach, photons are treated as massless (scalar) point particles. 
These particles follow null trajectories with

\[
  k = \underline{h}^{-1}(\dot{x}), \qquad k^2 = 0. \tag{14.83}
\]

The trajectories are still specified by the equation k·Dk = 0. For radial infall
we must have

\[
  k = \nu(e_t - e_r), \tag{14.84}
\]

where ν = k·e_t is the frequency measured by radially free-falling observers
(at rest at infinity). The photon trajectory is independent of the frequency, as
demanded by the equivalence principle. The path defined by k is given by

\[
  \dot{x} = \underline{h}(k) = \nu\left(e_t - \left(1 + \sqrt{2M/r}\right)e_r\right). \tag{14.85}
\]

It follows that

\[
  \frac{dr}{dt} = -\left(1 + \sqrt{2M/r}\right). \tag{14.86}
\]

This integrates straightforwardly to give the photon path. We have therefore
found the path without employing the equation of motion. This is possible
because we restricted to motion in a single spacetime plane.

The equations of motion tell us how the frequency changes along the path. To
find this we need

\[
  \omega(k) = -\nu\left(\frac{M}{2r^3}\right)^{1/2}\sigma_r, \tag{14.87}
\]

from which we see that

\[
  \dot{\nu} = -\left(\frac{M}{2r^3}\right)^{1/2}\nu^2. \tag{14.88}
\]

This equation is more usefully expressed in terms of the derivative with respect
to r. We use

\[
  \dot{r} = -\nu\left(1 + \sqrt{2M/r}\right) \tag{14.89}
\]

to arrive at

\[
  \frac{1}{\nu}\frac{d\nu}{dr} = \frac{M}{r}\,\frac{1}{2M + \sqrt{2Mr}}
                               = \frac{1}{2r}\,\frac{1}{\sqrt{r/r_g} + 1}, \tag{14.90}
\]

where r_g = 2M is the Schwarzschild radius. This equation can again be
integrated straightforwardly to tell us how the frequency ν changes with radius. We see
that nothing untoward happens until r = 0 is reached. 
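
The integration of equation (14.90) can be sketched numerically. The code below integrates d(ln ν)/dr for an infalling photon with Simpson's rule and compares the result with the antiderivative ν ∝ 1/(1 + √(r_g/r)), which follows from (14.90) by direct differentiation. The radii, the value of r_g and the function names are assumptions made for the example.

```python
import math

def dlnnu_dr(r, r_g):
    """Right-hand side of equation (14.90) for an infalling photon."""
    return 1.0 / (2.0 * r * (math.sqrt(r / r_g) + 1.0))

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even, a > b is allowed."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def ratio_numeric(r_emit, r_obs, r_g):
    """nu(r_obs) / nu(r_emit) from integrating (14.90)."""
    return math.exp(simpson(lambda r: dlnnu_dr(r, r_g), r_emit, r_obs))

def ratio_closed(r_emit, r_obs, r_g):
    """Same ratio from nu proportional to 1 / (1 + sqrt(r_g / r))."""
    return (1.0 + math.sqrt(r_g / r_emit)) / (1.0 + math.sqrt(r_g / r_obs))

r_g = 2.0  # horizon at r = 2, i.e. M = 1
print(ratio_numeric(10.0, 0.5, r_g), ratio_closed(10.0, 0.5, r_g))
```

The integration runs smoothly through r = r_g, and the frequency seen by the infalling observers decreases steadily as the photon moves inwards.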




We can repeat the previous analysis for outgoing photons. For this case we
have

\[
  k = \nu(e_t + e_r) \tag{14.91}
\]

and the path is

\[
  \dot{x} = \underline{h}(k) = \nu\left(e_t + \left(1 - \sqrt{2M/r}\right)e_r\right). \tag{14.92}
\]

It follows that

\[
  \frac{dr}{dt} = 1 - \sqrt{2M/r}. \tag{14.93}
\]

But now, when r < 2M the path is still inwards. Inside r = 2M, not even
light can escape. The surface r = 2M is called the event horizon. It marks the
boundary between two regions, one of which (the interior in this case) cannot
signal to the other. We also find that

\[
  \frac{1}{\nu}\frac{d\nu}{dr} = \frac{M}{r}\,\frac{1}{2M - \sqrt{2Mr}}
                               = -\frac{1}{2r}\,\frac{1}{\sqrt{r/r_g} - 1}, \tag{14.94}
\]

which is negative outside the horizon. So, as photons climb out of a gravitational 
field, they are redshifted. This is one of the best-tested predictions of general 
relativity. The redshift becomes increasingly large as the horizon is approached, 
so photons emitted from near the horizon are strongly redshifted as they climb 
out to infinity. The various features of radial motion in a black hole background 
are shown in figure 14.1. One conclusion from this plot is that, as seen by external 
observers, any object falling through the horizon appears to hover outside the 
horizon and just fade out of existence as the redshift increases. 

If any object collapses to within its event horizon, it must carry on collapsing
to form a central singularity. There is no possible force capable of preventing the
collapse. This is because matter is always constrained to follow timelike paths, 
and if the entire future light-cone points inwards towards the singularity, no mat- 
ter can escape. The object remaining at the end of this process is called a black 
hole. All paths for infalling matter terminate on the singularity. There has been 
much research into the properties of singularities, though their nature remains 
enigmatic. In one sense, gravitational singularities are no more difficult to deal 
with than singularities in the electromagnetic field due to point sources. They 
can also be analysed in much the same way using integral equations. But this 
(classical) treatment of singularities can only contain part of the story. Quantum 
mechanically, black holes have an associated entropy, implying the existence of 
a series of microstates consistent with the macroscopic properties of the hole. 
It is widely believed that a more complete understanding of quantum gravity 
should explain this phenomenon through a detailed quantum description of the 
singularity.


Figure 14.1 Matter and photon trajectories in a black hole background.
The solid lines are photon trajectories, and the horizon lies at r = 2.
Outside the horizon it is possible to send photons out to infinity, and hence
communicate with the rest of the universe. As the emitter approaches
the horizon, these photons are strongly redshifted and take a long time to
escape. Once inside the horizon, all photon paths end on the singularity.
The broken lines represent two possible trajectories for infalling matter.
Trajectory I is for a particle released from rest at r = 4. Trajectory II is
for a particle released from rest at r = ∞.


14.3.3 Stationary observers


It is instructive to see how physics appears from the point of view of stationary
observers in a Schwarzschild background. These observers have constant r, θ, φ,
so

\[
  \dot{x} = \dot{t}\,e_t. \tag{14.95}
\]

It follows that

\[
  v = \dot{t}\left(e_t + \sqrt{2M/r}\,e_r\right). \tag{14.96}
\]

But we require that v² = 1 for the path to be parameterised by the observer's
proper time, so

\[
  \dot{t}^2(1 - 2M/r) = 1, \qquad \dot{t} = (1 - 2M/r)^{-1/2}. \tag{14.97}
\]

This is a constant, since r is fixed for these observers. We can see immediately
that it is only possible to remain at rest outside the horizon. This is reasonable
given the preceding considerations, though the picture is not quite so clear if the
black hole is rotating. For this case there is a region outside the horizon within
which it is impossible to remain at rest (though it is still possible to escape).
The covariant acceleration bivector for a particle with velocity v is defined by

\[
  (v\cdot Dv)\wedge v = \dot{v}\,v + \omega(v)\cdot v\,v. \tag{14.98}
\]

This gives the acceleration required to follow a given path. For stationary
observers we have

\[
  (v\cdot Dv)\wedge v = \frac{M}{r^2}\,(1 - 2M/r)^{-1/2}\,\sigma_r. \tag{14.99}
\]

So an observer with mass m needs to apply a force of Mm/r² × (1 − 2M/r)^{-1/2} to
remain at rest. This is the Newtonian value multiplied by a relativistic correction
term. This correction becomes increasingly large as the horizon is approached,
as one would expect.

We can now look at physics from the point of view of these observers, who can be
viewed as both being stationary and having constant acceleration. For example,
if a second observer has velocity γ_0 (so is in free fall), the relative velocity the
two observers measure when their positions coincide is

\[
  \frac{v\wedge\gamma_0}{v\cdot\gamma_0} = \sqrt{2M/r}\,\sigma_r. \tag{14.100}
\]

As we might expect, this is the Newtonian result. The only difference now lies 
in the interpretation of who is accelerating. The stationary observer is the one 
applying a force, so we now say that it is this observer that is accelerating. The 
observer in free fall is applying zero force, so is not accelerating. That is, we no 
longer view gravity as applying a force, as this would require a concept of what 
the particle would have done if the gravitational field were not present. Such a 
concept is not gauge-invariant, so is unphysical. 
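
The quantities attached to a stationary observer in equations (14.97), (14.99) and (14.100) are simple enough to tabulate. The helper below is a hypothetical convenience function written for this illustration (units G = c = 1); it returns ṫ, the magnitude of the force needed to stay at rest, and the speed relative to a radially free-falling observer, and it refuses radii at or inside the horizon, where no stationary observers exist.

```python
import math

def stationary_observer(r, M, m=1.0):
    """Return (t_dot, holding_force, relative_speed) for an observer at rest at r."""
    if r <= 2.0 * M:
        raise ValueError("no stationary observers at or inside the horizon r = 2M")
    t_dot = (1.0 - 2.0 * M / r) ** -0.5      # equation (14.97)
    force = M * m / r**2 * t_dot             # magnitude from equation (14.99)
    rel_speed = math.sqrt(2.0 * M / r)       # equation (14.100)
    return t_dot, force, rel_speed

for radius in (100.0, 8.0, 2.5, 2.01):
    print(radius, stationary_observer(radius, M=1.0))
```

As r approaches 2M the relative speed tends to 1 while the holding force grows without bound, which is the numerical face of the statement that it is only possible to remain at rest outside the horizon.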


14.3.4 Absorption and scattering 


The presence of the horizon implies that incident particles with total energy
E > mc² can suffer two fates. Either they will be scattered by the gravitational
fields, or they will be absorbed onto the central singularity. The crucial quantity
that determines the fate of the particle is the angular momentum J. In figure 14.2 we
plot the effective potential of equation (14.81) for a range of values of J.
If J is too small there is nothing to prevent the particle hitting the singularity.
As J increases, the effective potential develops a barrier. If this barrier is greater
than the total (non-relativistic) energy, the particle is no longer absorbed, and
instead is scattered by the black hole.

Figure 14.2 The gravitational effective potential. The potential for a unit
mass particle is defined by equation (14.81), and units are chosen so that
the horizon lies at r = 2. The plots are for J values of 0, 4, 8, 16 and
24. For small J nothing prevents the particle hitting the singularity. As
J increases a barrier of increasing height is formed. If the particle has
insufficient energy to surmount this barrier it is scattered.


For a given energy, we can determine the critical value of J that distinguishes
between absorption and scattering. This is most usefully encoded in terms of an
impact parameter b, as illustrated in figure 14.3. Asymptotically, the incoming
particle has angular momentum

\[
  J = b\,\dot{r}(\infty). \tag{14.101}
\]

But in this region the energy is determined entirely by ṙ, so the impact parameter
is given by

\[
  b^2 = \frac{J^2}{\alpha^2 - 1}, \tag{14.102}
\]

where α is the energy per unit mass of the incident particle, as defined in
equation (14.75). For a fixed energy, the critical value of J therefore determines the
critical value of the impact parameter. From the point of view of absorption,
the black hole then appears as a disc of radius b, and the total absorption cross
section is defined by

\[
  \sigma_{\text{abs}} = \pi b^2. \tag{14.103}
\]

This will be a decreasing function of energy: the faster the particle is travelling,
the less likely it is to be absorbed.

Figure 14.3 The impact parameter. In the asymptotic incoming region,
the impact parameter b measures the distance between the incoming
trajectory and a parallel radial trajectory. For a black hole there is a critical
value of b inside which all geodesics terminate on the singularity. The
diagram also defines the scattering angle θ.

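The algebra needed to compute the absorption cross section is straightforward,
if a little tedious. First we write x = 1/r, so that the effective potential becomes

\[
  V_{\text{eff}} = -Mx + \frac{b^2(\alpha^2 - 1)}{2}\,x^2(1 - 2Mx). \tag{14.104}
\]

The turning point is at

\[
  x_c = \frac{1}{6M}\left(1 + \left(1 - \frac{12M^2}{b^2(\alpha^2 - 1)}\right)^{1/2}\right). \tag{14.105}
\]

To find b the equation we need to solve is therefore

\[
  2V_{\text{eff}}(x_c) = \alpha^2 - 1. \tag{14.106}
\]

The solution then returns the absorption cross section

\[
  \sigma_{\text{abs}} = \frac{\pi M^2}{2u^4}\left(8u^4 + 20u^2 - 1 + (1 + 8u^2)^{3/2}\right), \tag{14.107}
\]

where we have expressed the result in terms of the velocity u:

\[
  u^2 = \frac{p^2}{E^2} = \frac{\alpha^2 - 1}{\alpha^2}. \tag{14.108}
\]

The absorption cross section is plotted in figure 14.4. For small velocities we see
that

\[
  \sigma_{\text{abs}} \mapsto \frac{16\pi M^2}{u^2}. \tag{14.109}
\]
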

Figure 14.4 The classical absorption cross section. The cross section is a
function of the incident velocity u (in units of c). As the velocity approaches
the speed of light the cross section approaches the photon limit, as shown
by the straight line. The vertical axis is in units of (GM/c²)².


As the incident velocity decreases, the absorption cross section increases, as is to
be expected. As the velocity increases the absorption cross section tends towards
the limiting result for a massless particle. For these the effective potential is
simply

\[
  V_{\text{eff}} = \frac{J^2}{2r^2}\left(1 - \frac{2M}{r}\right). \tag{14.110}
\]

The turning point occurs at r = 3M, at which the effective potential has the
value J²/54M². Equating this with the asymptotic energy J²/2b² we see that for
photons b² = 27M², and the photon absorption cross section is

\[
  \sigma_{\text{abs}} = \pi b^2 = 27\pi M^2. \tag{14.111}
\]


This is the limiting value of equation (14.107) as u → 1. In section 14.4.3 we
study how these features are modified by a more complete, quantum treatment 
of the absorption process. 
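
The closed form (14.107) can be cross-checked against the defining condition (14.106). The sketch below is an illustration under stated assumptions: it works with J² = b²(α² − 1) from equation (14.102), evaluates the barrier height at the turning point of the effective potential, and uses a plain bisection (an arbitrary choice of root finder) to locate the critical J². All function names are hypothetical and units are G = c = 1.

```python
import math

def sigma_closed_form(u, M=1.0):
    """Absorption cross section of equation (14.107)."""
    return math.pi * M**2 / (2.0 * u**4) * (
        8.0 * u**4 + 20.0 * u**2 - 1.0 + (1.0 + 8.0 * u**2) ** 1.5)

def barrier_height(J2, M=1.0):
    """V_eff at its inner turning point, valid for J2 >= 12 M^2."""
    x_c = (1.0 + math.sqrt(1.0 - 12.0 * M**2 / J2)) / (6.0 * M)
    return -M * x_c + 0.5 * J2 * x_c**2 * (1.0 - 2.0 * M * x_c)

def sigma_numeric(u, M=1.0):
    """Solve 2 V_eff(x_c) = alpha^2 - 1 for J^2 by bisection, then use (14.103)."""
    alpha2 = 1.0 / (1.0 - u**2)        # from u^2 = (alpha^2 - 1) / alpha^2
    target = 0.5 * (alpha2 - 1.0)
    lo = hi = 12.0 * M**2              # barrier first appears at J^2 = 12 M^2
    while barrier_height(hi, M) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if barrier_height(mid, M) < target:
            lo = mid
        else:
            hi = mid
    b2 = 0.5 * (lo + hi) / (alpha2 - 1.0)   # equation (14.102)
    return math.pi * b2

for u in (0.1, 0.5, 0.9):
    print(u, sigma_numeric(u), sigma_closed_form(u))
print(sigma_closed_form(1.0) / math.pi)     # photon limit of (14.111)
```

The two routes should agree to rounding error, and evaluating the closed form at u = 1 reproduces the photon value 27πM².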

Scattering presents a more difficult problem. The differential scattering cross
section for a Newtonian 1/r potential is determined by the Rutherford formula

\[
  \frac{d\sigma}{d\Omega} = \frac{M^2}{4u^4\sin^4(\theta/2)}, \tag{14.112}
\]

where θ is the scattering angle and u is the velocity of the incident particle. This
formula relates the incident cross sectional area σ to the solid angle dΩ, where

\[
  d\sigma = 2\pi b\,db, \qquad d\Omega = 2\pi\sin(\theta)\,d\theta. \tag{14.113}
\]

The Rutherford cross section formula is easily computed from the properties of
hyperbolic trajectories. The relativistic corrections to the Rutherford formula
are generated by the additional r^{-3} term in the potential. This term makes
the problem considerably more difficult to solve, and no simple analytic formula
exists for the classical scattering cross section. One problem is that it is now
possible for particles to spiral around the centre before escaping. We could
build up a perturbative picture of the scattering problem using the techniques
described in section 3.3.1, though the resulting expressions are usually extremely
complicated. A better approach to this problem is described in section 14.4.1,
where the cross section is calculated using perturbative quantum theory.


14.3.5 Electromagnetism in a black hole background 


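Further insight into the nature and effects of a black hole is obtained by
considering the electromagnetic fields surrounding charges held at rest outside the
horizon. The relevant equations were obtained in section 13.5.8. We assume that
the charge is placed at a distance a > 2M along the z axis. The vector potential
can be written in terms of a single scalar potential V(r, θ) as

\[
  A = V(r, \theta)\left(e_t + \frac{\sqrt{2Mr}}{r - 2M}\,e_r\right), \tag{14.114}
\]

so that

\[
  F = -\frac{\partial V}{\partial r}\,e_r e_t
      - \frac{1}{r - 2M}\frac{\partial V}{\partial\theta}\,\hat{\theta}\left(e_t + \sqrt{2M/r}\,e_r\right) \tag{14.115}
\]

and

\[
  D = -\frac{\partial V}{\partial r}\,e_r e_t
      - \frac{1}{r - 2M}\frac{\partial V}{\partial\theta}\,\hat{\theta} e_t. \tag{14.116}
\]

The Maxwell equations now reduce to the single partial differential equation

\[
  \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial V}{\partial r}\right)
  + \frac{1}{r(r - 2M)\sin(\theta)}\frac{\partial}{\partial\theta}\left(\sin(\theta)\frac{\partial V}{\partial\theta}\right) = -\rho, \tag{14.117}
\]

where ρ = qδ(x − a) is a δ-function at z = a. The solution (originally found by
Linet) is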


(14.118) 


where

\[
  d = \left(r(r - 2M) + (a - M)^2 - 2(r - M)(a - M)\cos(\theta) + M^2\cos^2(\theta)\right)^{1/2}. \tag{14.119}
\]


When this result is substituted back into equation (14.115) we see that the
covariant field F is both finite and continuous at the horizon. It follows that we
have found a global solution to the electromagnetic field equations, appropriate
both inside and outside the horizon.

Figure 14.5 Streamlines of the electric field in a black hole background.
The horizon lies at r = 2 and the charge is placed on the z axis. In the
left-hand diagram the charge is held at z = 3, and in the right-hand diagram
it is at z = 2.1.

One way to illustrate the global properties of F is to plot the streamlines of 
D. Equation (13.279) ensures that these streamlines begin and end on charges, 
so for our case of a single isolated charge they should therefore spread out from 
the charge and cover all space. Furthermore, since the distance scale r was 
chosen to agree with the gravitationally-defined distance, the streamlines of D 
convey genuine intrinsic information. The plots therefore encode gauge-invariant 
information about the electromagnetic field. Figure 14.5 shows streamline plots 
for charges held at different distances above the horizon. A polarisation charge 
is clearly visible at the origin, and streamlines are attracted towards this but 
never actually meet it. The effects of the polarisation charge can be felt outside 
the horizon as a repulsive force acting back on the charge. That is, less force 
is required to keep a charge at rest outside a black hole than is required for an 
uncharged particle. The fact that the origin of this effect lies inside the horizon 
reinforces the importance of constructing global solutions to the field equations. 


14.3.6 Other gauges 


Before proceeding it is useful to study the vacuum spherical equations in an
arbitrary gauge. We return to the spherical equations before the $f_2 = 0$ gauge
choice was made, and again impose that M is constant. The equations that
remain are
$$ g_1^2 - g_2^2 = 1 - 2M/r \tag{14.120} $$
and 


$$ \partial_r g_1 = G, \qquad \partial_r g_2 = F, \tag{14.121} $$


and all fields are functions of r only. The bracket relation of equation (14.35) 
gives 


$$ g_2\partial_r f_2 - g_1\partial_r f_1 = G f_1 - F f_2, \tag{14.122} $$
from which it follows that
$$ \partial_r(f_1 g_1 - f_2 g_2) = \partial_r \det(h) = 0. \tag{14.123} $$


The determinant of h is constant, and the value of this constant depends on the
choice of gauge. Because the Riemann tensor falls off as $r^{-3}$ we always choose
to work in a gauge where h tends to the identity as $r \to \infty$. In this case we have
det (h) = 1, so we can write

$$ f_1 g_1 - f_2 g_2 = 1. \tag{14.124} $$


No other equations remain to fix the solution further. We therefore have two 
free functions in the choice of h function. 

A useful alternative to the Newtonian gauge chosen in this section is to write 
the solution in Kerr—Schild form. For this we set 


$$ g_1 = 1 - M/r, \qquad g_2 = -M/r, \qquad f_1 = 1 + M/r, \qquad f_2 = M/r. \tag{14.125} $$
In this case the h function takes on the compact form
$$ \bar{h}(a) = a + \frac{M}{r}\, a\cdot e_-\, e_-, \qquad e_- = e_t - e_r. \tag{14.126} $$


This form has a number of convenient algebraic features. The first is
that the solution is of the form of the identity plus an interaction term, as is also
the case in the Newtonian gauge setup. The second is that this form of h(a) is
a symmetric function. Finally, $e_-$ is a null vector that satisfies $h(e_-) = e_-$. All
of these features can be employed to simplify calculations.
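
The algebraic constraints on the Kerr-Schild functions are easy to confirm with exact rational arithmetic. The following is a minimal sketch, not part of the original derivation: the function names `kerr_schild` and `check_gauge` are our own, and the values of M and r are arbitrary illustrative choices.

```python
from fractions import Fraction


def kerr_schild(M, r):
    """Return (f1, f2, g1, g2) for the Kerr-Schild choice of equation (14.125)."""
    x = Fraction(M) / Fraction(r)
    return 1 + x, x, 1 - x, -x


def check_gauge(M, r):
    """Evaluate the combinations that appear in the constraints and line element."""
    f1, f2, g1, g2 = kerr_schild(M, r)
    return {
        "g1^2 - g2^2": g1 ** 2 - g2 ** 2,      # should equal 1 - 2M/r
        "det_h": f1 * g1 - f2 * g2,            # should equal 1 everywhere
        "off_diagonal": f1 * g2 - f2 * g1,     # half the dt dr coefficient
        "dr2": -(f1 ** 2 - f2 ** 2),           # coefficient of dr^2
    }


if __name__ == "__main__":
    for r in (2, 5, 100):
        print(r, check_gauge(1, r))
```

At r = 2M the off-diagonal combination evaluates to -1, which is the black hole sign discussed in the text that follows.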

The line element generated by our general form of h function is 


$$ ds^2 = (1 - 2M/r)\,dt^2 + 2(f_1 g_2 - f_2 g_1)\,dt\,dr - (f_1^2 - f_2^2)\,dr^2 - r^2(d\theta^2 + \sin^2\theta\, d\phi^2). \tag{14.127} $$
This in effect contains one arbitrary function, because the constraint on the
determinant fixes one of the two unknown coefficients. The remaining unspecified
degree of freedom lies in the rotation gauge, which does not affect the metric. We
can draw an important conclusion about the metric by considering its behaviour
at the horizon. There we must have $g_1 = \pm g_2$, and we know that $f_1 g_1 - f_2 g_2 = 1$
globally. It follows that

$$ f_1 g_2 - f_2 g_1 = \pm 1 \quad \text{at } r = 2M, \tag{14.128} $$


so the off-diagonal term must be either +1 or —1 at the horizon. The presence 
of the horizon must break time reversal symmetry. This is to be expected. For 
a black hole (corresponding to the negative solution), the horizon is the place 
where particles can fall in, but cannot escape. The opposite value at the horizon 
(corresponding to a positive value of g2 in the Newtonian gauge) defines an 
object from which particles can escape, but no particle can cross the horizon. 
This is called a white hole, though it is unclear whether such a solution defines 
a physically relevant object. 


14.4 Quantum mechanics in a black hole background 


The gauge theory formulation of gravity is motivated by constructing gauge 
fields to ensure that the Dirac equation is covariant under local rotations and 
displacements. We now study the effects of the black hole gauge fields on a 
Dirac fermion. Assuming that no electromagnetic couplings are present, the 
minimally-coupled equation takes the familiar form 


$$ D\psi I\sigma_3 = m\psi\gamma_0. \tag{14.129} $$


The simplicity of the h field in the Newtonian gauge suggests that this will be
the simplest gauge to work in. As always, we must ensure that all physical
predictions are gauge-invariant. With the gravitational fields as described in
equations (14.64) and (14.65), the Dirac equation becomes

$$ \nabla\psi I\sigma_3 - \left(\frac{2M}{r}\right)^{1/2}\gamma_0\left(\frac{\partial\psi}{\partial r} + \frac{3}{4r}\psi\right) I\sigma_3 = m\psi\gamma_0. \tag{14.130} $$

If we pre-multiply by $\gamma_0$ and employ the $i$ symbol to represent right-sided
multiplication by $I\sigma_3$, then equation (14.130) becomes

$$ i\partial_t\psi = -i\boldsymbol{\nabla}\psi + i\left(\frac{2M}{r}\right)^{1/2} r^{-3/4}\,\partial_r\left(r^{3/4}\psi\right) + m\bar{\psi}, \tag{14.131} $$

where $\bar{\psi} = \gamma_0\psi\gamma_0$. We see that the Newtonian gauge has enabled us to write the
Dirac equation in a very straightforward Hamiltonian form. One reason for this
simplicity is that the spatial sections defined by the time coordinate t are flat.
The interaction Hamiltonian in equation (14.131), with all constants included,
is

$$ H_I(\psi) = i\hbar\left(\frac{2GM}{r}\right)^{1/2}\left(\frac{\partial}{\partial r} + \frac{3}{4r}\right)\psi. \tag{14.132} $$
This single term incorporates all gravitational effects exerted by a black hole
on a Dirac fermion. A number of observations can be made immediately. The
first is that the interaction Hamiltonian does not depend on the mass of the
particle, which is how the equivalence principle is embodied in the Dirac equation.
The second point is that $H_I$ does not depend on the speed of light. The
non-relativistic approximation is therefore straightforward, following the technique
of section 8.3.3. To lowest order in $c^{-1}$ we obtain the Schrödinger equation
with interaction determined by $H_I$. For stationary states this equation is

$$ -\frac{\hbar^2}{2m}\boldsymbol{\nabla}^2\psi + i\hbar\left(\frac{2GM}{r}\right)^{1/2}\frac{1}{r^{3/4}}\frac{\partial}{\partial r}\left(r^{3/4}\psi\right) = E\psi, \tag{14.133} $$

where $\psi$ now denotes the Schrödinger wave function. This equation is simplified
by introducing the phase-transformed variable

$$ \Psi = \psi\exp\left(-i(8r/a_G)^{1/2}\right), \tag{14.134} $$
where
$$ a_G = \frac{\hbar^2}{GMm^2}. \tag{14.135} $$

The distance $a_G$ is the gravitational analogue of the Bohr radius for the hydrogen
atom. The new variable $\Psi$ satisfies the simple equation

$$ -\frac{\hbar^2}{2m}\boldsymbol{\nabla}^2\Psi - \frac{GMm}{r}\Psi = E\Psi. \tag{14.136} $$

This is precisely the equation we would expect if we used the Newtonian gravitational
potential. The solutions for $\Psi$ are therefore Coulomb wavefunctions.
The non-relativistic limit enables us to make two immediate predictions. The 
first is that a spectrum of bound states should exist, with similar properties to 
that of the hydrogen atom. The second is that, in the non-relativistic limit, the 
scattering cross section should be determined by the Rutherford formula. This 
latter prediction is confirmed in the following section. 
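
To give a feel for the scales involved, the hydrogen-like estimates implied by the Newtonian limit can be sketched in a few lines. This is an illustration under stated assumptions rather than a result from the text: we assume the standard hydrogen spectrum with the Coulomb coupling replaced by GMm, we keep only the real part of the energy (the decay discussed below is ignored), and the function names and default units G = hbar = 1 are our own choices.

```python
def gravitational_bohr_radius(M, m, G=1.0, hbar=1.0):
    """Gravitational analogue of the Bohr radius, hbar^2 / (G M m^2)."""
    return hbar ** 2 / (G * M * m ** 2)


def bound_state_energy(n, M, m, G=1.0, hbar=1.0):
    """Hydrogen-like level E_n = -(G M)^2 m^3 / (2 hbar^2 n^2).

    Assumes the standard Coulomb spectrum with coupling G M m; only the
    real part of the energy is modelled here.
    """
    return -((G * M) ** 2) * m ** 3 / (2.0 * hbar ** 2 * n ** 2)


if __name__ == "__main__":
    M, m = 2.0, 0.5
    a = gravitational_bohr_radius(M, m)
    for n in (1, 2, 3):
        print(n, bound_state_energy(n, M, m), -1.0 / (2.0 * m * a ** 2 * n ** 2))
```

The two printed columns agree, reflecting the equivalent form E_n = -hbar^2 / (2 m a^2 n^2) with a the gravitational Bohr radius.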

The interaction Hamiltonian $H_I$ hides a significant feature, which is that it is
not Hermitian due to the presence of the singularity. To see this we form the
difference between $H_I$ and its adjoint. With $\phi$ and $\psi$ both Dirac spinors we find
that


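$$ \int d^3x\,\langle\phi^\dagger H_I(\psi)\rangle_q = \sqrt{2M}\int d\Omega\int_0^\infty r^2\,dr\; r^{-5/4}\left\langle\phi^\dagger\,\partial_r\left(r^{3/4}\psi\right) I\sigma_3\right\rangle_q $$
$$ = \int d^3x\,\langle (H_I(\phi))^\dagger\psi\rangle_q + \sqrt{2M}\int d\Omega\left[r^{3/2}\langle\phi^\dagger\psi I\sigma_3\rangle_q\right]_0^\infty, \tag{14.137} $$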


where we follow the convention of section 8.1.2. We will see shortly that all
wavefunctions approach the origin as $r^{-3/4}$. The boundary term at the origin
therefore does not vanish, and the Hamiltonian is not Hermitian. It follows
that any normalisable stationary state must have an imaginary component to
its energy. This is sensible. For all states the covariant current vector is always 
timelike. Inside the horizon this vector must point inwards, towards the singu- 
larity, so current density is inevitably swept onto the singularity. This implies 
that bound states must necessarily decay, so we expect the energy to have an 
imaginary component. 


14.4.1 Scattering 


The Dirac equation (14.131) is ideally suited to a perturbative scattering calcu- 
lation employing the methods of section 8.5. We seek an iterative solution to the 
Green’s function equation 


$$ \left(i\hat{\nabla}_2 - \hat{B}(x_2) - m\right) S_G(x_2, x_1) = \delta^4(x_2 - x_1), \tag{14.138} $$

where

$$ \hat{B}(x) = i\hat{\gamma}_0\left(\frac{2M}{r}\right)^{1/2}\left(\frac{\partial}{\partial r} + \frac{3}{4r}\right). \tag{14.139} $$

As usual, the hats denote operators which act on spinors, and in this section we
retain the familiar $i$ symbol to denote the complex structure.
The iterative solution to equation (14.138) is given by

$$ S_G(x_f, x_i) = S_F(x_f, x_i) + \int d^4x_1\, S_F(x_f, x_1)\hat{B}(x_1)S_F(x_1, x_i) + \int d^4x_1\, d^4x_2\, S_F(x_f, x_1)\hat{B}(x_1)S_F(x_1, x_2)\hat{B}(x_2)S_F(x_2, x_i) + \cdots, \tag{14.140} $$

where $S_F(x_2, x_1)$ is the free-field, position-space Feynman propagator. The
interaction term $\hat{B}(x)$ is independent of time so energy is conserved throughout
the interaction. Converting to momentum space we find that the scattering
multivector $T_{fi}$, as defined in equation (8.229), is given by


Ëk k+m 


B k) ——____—_ 
3 (P; ) fre ee 


ens B(k,p;) ++ 


Tr = (Pe +m) (2er +f 


(14.141) 
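Here $B(p_2, p_1)$ denotes the spatial Fourier transform of the interaction term,
$$ B(p_2, p_1) = (2M)^{1/2}\, i\hat{\gamma}_0\int d^3x\; e^{-i\mathbf{p}_2\cdot\mathbf{x}}\,\frac{1}{r^{1/2}}\left(\frac{\partial}{\partial r} + \frac{3}{4r}\right)e^{i\mathbf{p}_1\cdot\mathbf{x}}, \tag{14.142} $$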


where bold symbols refer to spatial components only. To evaluate this we first 
write 


B(pa, pı) = 2M)" ( FFP, ~ pa) + J. (tas) 
A=1 
where


f(p) = pear = (7) (14.144) 


We therefore find that the momentum-space interaction is governed by the vertex 
factor 


1/2 P? — py” A 
[po — p|"? 


This factor has the unusual feature of vanishing if the ingoing and outgoing 
particles are on-shell, because energy is conserved throughout the process. It 
follows that the lowest order contribution to the scattering cross section vanishes. 
This is to be expected, as the vertex factor goes as $\sqrt{M}$, and we expect the
amplitude to go as M to recover the Rutherford formula in the low velocity 
limit. 

Working to the lowest non-zero order in M the scattering multivector becomes 


B(pp, pı) = 3n*/?i(M) (14.145) 


Tri = —90°M (fs + m)A0L140, (14.146) 
where 


(14.147) 


: -f Ëk pr-k ktm k-p? 
t J (2r) [py — kIT? k? = m? + ic |k = p| 
Here we have explicitly included a factor of $i\epsilon$ to ensure that any poles in the
complex plane are navigated in the correct manner. However, we have

$$ k^2 - m^2 = E^2 - \mathbf{k}^2 - m^2 = \mathbf{p}^2 - \mathbf{k}^2, \tag{14.148} $$

where E is the particle energy and $\mathbf{p}^2 = \mathbf{p}_i^2 = \mathbf{p}_f^2$. The pole in the propagator
is therefore cancelled by the vertex factors, so there is no need for the factor of
$i\epsilon$ in the denominator. The integral we need to evaluate is therefore


dk k? — p? n 
H= k 14.149 
=j E pre + ™) es 


and the result of this integral is 


1 


i — 
9n2q? 


(2m + 3(Bf + Bi) — 440), (14.150) 


where $q = p_f - p_i$. The scattering multivector is now given by


4rM 
a 2 (E(2E +q) +p? + ppp:). (14.151) 


This should be contrasted with the equivalent expression for Coulomb scattering, 
given in equation (8.237). We see immediately that the coupling term goes 
with the particle energy, rather than its mass. This is because the interaction
Hamiltonian is independent of m. The unpolarised cross section is given by 


do _ |T}? 
dQ 161? 
2M? 2 2 2 2\2 2 
a. (m (E? — pp- pi) + (2E? — m?)? + 4E*p,-p;). (14.152) 


If we now let $v = |\mathbf{p}|/E$ denote the particle velocity, and $\theta$ the scattering angle,
we arrive at the simple expression

$$ \frac{d\sigma}{d\Omega} = \frac{M^2}{4v^4\sin^4(\theta/2)}\left(1 + 2v^2 - 3v^2\sin^2(\theta/2) + v^4 - v^4\sin^2(\theta/2)\right). \tag{14.153} $$

As demanded by the equivalence principle, this formula depends only on the
incident velocity, and not on the particle mass. This confirms that the equivalence
principle is directly encoded in the Dirac equation as a consequence of
minimal coupling. The final cross section formula is gauge-invariant. We can
perform analogues of this calculation in a range of different gauges, and the same 
result is obtained in all cases. Furthermore, all terms in the result have local, 
gauge-invariant definitions. The mass M can be defined in terms of tidal forces, 
and the velocity v is that measured locally by observers in radial free fall from 
rest at infinity. The angle $\theta$ is the angle between asymptotic in and out states,
measured locally in the asymptotic regime. 

The cross section of equation (14.153) confirms that the low velocity limit
recovers the Rutherford formula. The massless limit $m \to 0$ is also well defined,
and is obtained by setting v = 1. This produces the simple formula

$$ \frac{d\sigma}{d\Omega} = \frac{M^2\cos^2(\theta/2)}{\sin^4(\theta/2)}. \tag{14.154} $$

The small angle limit to this gives a cross section going as $(4M)^2/\theta^4$. This
recovers the classical formula for the bending of light by a massive source. While
the calculation here has assumed a point mass source, the small angle limit is
appropriate for any localised source of gravitational mass M. The massless limit
contains a surprise in the backward direction, however. Simulations of scattering 
based on massless particles following null geodesics reveal a large ‘glory’ scat- 
tering in the backward direction. This is absent from the quantum treatment, 
and is a diffraction effect for massless spin-1/2 particles that is not evident at 
the classical level. The scheme described here can be modified to the case of a
scalar field, and produces the differential cross section

$$ \frac{d\sigma}{d\Omega} = \frac{M^2(1+v^2)^2}{4v^4\sin^4(\theta/2)}. \tag{14.155} $$

Again, we see that the equivalence principle is obeyed, and the various small angle
and low velocity approximations are retained. The classical cross section contains
further structure, attributable to multiple orbits. In the quantum framework
these effects should be present in the higher-order terms. 
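
The limits quoted in this section can be checked numerically. The sketch below evaluates the fermion cross section of equation (14.153) and compares it with the Rutherford form at low velocity, the massless formula (14.154) at v = 1, and the small-angle behaviour $(4M)^2/\theta^4$. The function names are our own and units with G = c = 1 are assumed.

```python
import math


def fermion_cross_section(M, v, theta):
    """Lowest-order cross section of equation (14.153), in units G = c = 1."""
    s2 = math.sin(theta / 2.0) ** 2
    bracket = 1 + 2 * v ** 2 - 3 * v ** 2 * s2 + v ** 4 - v ** 4 * s2
    return M ** 2 * bracket / (4.0 * v ** 4 * s2 ** 2)


def rutherford_cross_section(M, v, theta):
    """Low-velocity (Rutherford) form M^2 / (4 v^4 sin^4(theta/2))."""
    return M ** 2 / (4.0 * v ** 4 * math.sin(theta / 2.0) ** 4)


def massless_cross_section(M, theta):
    """Massless limit of equation (14.154)."""
    return M ** 2 * math.cos(theta / 2.0) ** 2 / math.sin(theta / 2.0) ** 4


if __name__ == "__main__":
    M, theta = 1.0, 1.2
    print(fermion_cross_section(M, 1e-3, theta) / rutherford_cross_section(M, 1e-3, theta))
    print(fermion_cross_section(M, 1.0, theta) / massless_cross_section(M, theta))
    small = 1e-3
    print(massless_cross_section(M, small) * small ** 4 / (4.0 * M) ** 2)
```

Each printed ratio is close to one, and the massless cross section vanishes in the backward direction $\theta = \pi$ because of the $\cos^2(\theta/2)$ factor.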


14.4.2 Stationary states and angular separation 


The Dirac equation in the Newtonian gauge is immediately separable in space 
and time, and admits stationary state solutions of the form 


$$ \psi(x) = \psi(\mathbf{x})\exp(-Et\,I\sigma_3). \tag{14.156} $$


If the state is normalisable then E contains an imaginary component determined 
by 


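$$ \mathrm{Im}(E) = -\lim_{r\to 0}\frac{\sqrt{2M}}{2N}\, r^{3/2}\int d\Omega\,\langle\psi^\dagger\psi\rangle, \tag{14.157} $$
where N is the normalisation constant
$$ N = \int d^3x\,\langle\psi^\dagger\psi\rangle. \tag{14.158} $$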


As expected, the sign of the imaginary component of E corresponds to a decaying 
wavefunction. This behaviour is independent of the sign of the real part of E, so 
both positive and negative energy states must decay. For scattering states we do 
not demand that $\psi$ is normalisable, and can look for solutions where the energy
is real, with E > m. 

With the time dependence separated out, equation (14.131) reduces to 


$$ \boldsymbol{\nabla}\psi - (2M/r)^{1/2}\, r^{-3/4}\,\partial_r\left(r^{3/4}\psi\right) = iE\psi - im\bar{\psi}. \tag{14.159} $$


To solve this equation we follow the standard procedure for a central potential 
and separate out the angular dependence. This is achieved using the spherical 
monogenics, described in section 8.4.1. We assume that the wavefunction takes 
the standard form of 


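$$ \psi(\mathbf{x}, \kappa) = \begin{cases} \psi_l^m u(r) + \sigma_r\psi_l^m v(r) I\sigma_3, & \kappa = l+1, \\ \sigma_r\psi_l^m u(r)\sigma_3 + \psi_l^m I v(r), & \kappa = -(l+1), \end{cases} \tag{14.160} $$
where $\kappa$ is a non-zero integer and u(r) and v(r) are complex functions of r only.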
On substituting this wavefunction into the Dirac equation (14.159) we obtain 
the pair of coupled radial equations 


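$$ \begin{pmatrix} 1 & -(2M/r)^{1/2} \\ -(2M/r)^{1/2} & 1 \end{pmatrix}\begin{pmatrix} u_1' \\ u_2' \end{pmatrix} = A\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \tag{14.161} $$
where
$$ A = \begin{pmatrix} \kappa/r & i(E+m) - (2M/r)^{1/2}(4r)^{-1} \\ i(E-m) - (2M/r)^{1/2}(4r)^{-1} & -\kappa/r \end{pmatrix}, \tag{14.162} $$
$u_1$ and $u_2$ are the reduced functions defined by
$$ u_1 = ru, \qquad u_2 = irv, \tag{14.163} $$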

and the primes denote differentiation with respect to r. The form of this equation 
should be contrasted with the hydrogen atom of section 8.4.3. 
To analyse equation (14.161) we first rewrite it in the equivalent form 


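$$ (1 - 2M/r)\begin{pmatrix} u_1' \\ u_2' \end{pmatrix} = \begin{pmatrix} 1 & (2M/r)^{1/2} \\ (2M/r)^{1/2} & 1 \end{pmatrix} A \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{14.164} $$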


This makes it clear that the equations have regular singular points at the origin
and horizon (r = 2M), as well as an irregular singular point at $r = \infty$. Unfortunately,
the special function theory required to deal with such equations has
not been developed. Hypergeometric functions are appropriate for differential 
equations with three regular singular points, or one regular and one irregular 
singular point. An attempt to generalise hypergeometric functions results in 
Heun’s equation, but most techniques for handling this involve series solutions 
and numerical integration, so these are the techniques that must be applied here. 
The presence of the three singular points implies that any power series will have 
a limited radius of convergence, so typically these can only be used to define 
initial data for numerical integration routines. 

A Frobenius series about the origin shows that both $u_1$ and $u_2$ approach the
origin as $r^{1/4}$. It follows that the wavefunction goes as $r^{-3/4}$ near the origin,
as was stated earlier. For normalisable states this behaviour ensures that the
energy contains an imaginary decay factor. Next we construct a series about the
horizon by writing

$$ u_1 = \eta^s\sum_{k=0}^\infty a_k\eta^k, \qquad u_2 = \eta^s\sum_{k=0}^\infty b_k\eta^k, \tag{14.165} $$

where $\eta = r - 2M$. On substituting this series into equation (14.164), and setting
$\eta = 0$, we obtain

$$ \frac{s}{2M}\begin{pmatrix} a_0 \\ b_0 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} \kappa/(2M) & i(E+m) - (8M)^{-1} \\ i(E-m) - (8M)^{-1} & -\kappa/(2M) \end{pmatrix}\begin{pmatrix} a_0 \\ b_0 \end{pmatrix}. \tag{14.166} $$
The two values of the index s for which this has non-zero solutions are

$$ s = 0 \quad\text{and}\quad s = -\tfrac{1}{2} + 4iME. \tag{14.167} $$


The s = 0 solution corresponds to an analytic power series with a well-defined 
wavefunction at the horizon. Such solutions are certainly physical. The second 
root gives rise to a wavefunction that is singular at the horizon, and as such is 
physically inadmissible. As a consequence, it is not possible to construct a complete
set of outgoing modes at infinity and in any scattering process some of the
wavefunction is lost. This is the quantum-mechanical description of absorption
by a black hole. 

Before proceeding, we should confirm that the two indicial roots at the horizon 
are gauge-invariant, and not an artifact of our various gauge choices. This is 
important because the singular index can be used to determine the Hawking 
temperature of the black hole. The method we use to confirm gauge invariance 
is quite general and can be applied to a range of situations. We start by keeping 
the gauge unspecified so that, after separating out the angular dependence, the 
Dirac equation reduces to 


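$$ \begin{pmatrix} L_r & L_t \\ L_t & L_r \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} \kappa/r - G/2 & im - F/2 \\ -im - F/2 & -\kappa/r - G/2 \end{pmatrix}\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{14.168} $$
We can still assume that the time dependence is of the form exp(-iEt), so that
equation (14.168) becomes

$$ \begin{pmatrix} g_1 & g_2 \\ g_2 & g_1 \end{pmatrix}\begin{pmatrix} u_1' \\ u_2' \end{pmatrix} = B\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \tag{14.169} $$

where

$$ B = \begin{pmatrix} \kappa/r - G/2 + if_2E & i(m + f_1E) - F/2 \\ -i(m - f_1E) - F/2 & -\kappa/r - G/2 + if_2E \end{pmatrix}. \tag{14.170} $$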


The form of time dependence is gauge-invariant, since the time coordinate is 
defined by the requirement that the Riemann tensor is stationary. Given a time 
coordinate t, a general displacement consistent with this requirement takes the 
form 


$$ t \mapsto t' = t + \alpha(r), \tag{14.171} $$


where $\alpha$ is a differentiable function of r. This ensures that stationary states all
go as exp(—iEt), regardless of the choice of time coordinate. 
Now, since $g_1^2 - g_2^2 = 1 - 2M/r$ holds for vacuum solutions in all gauges, we
obtain

$$ (1 - 2M/r)\begin{pmatrix} u_1' \\ u_2' \end{pmatrix} = \begin{pmatrix} g_1 & -g_2 \\ -g_2 & g_1 \end{pmatrix} B\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}. \tag{14.172} $$
We again look for a power series solution of the form of equation (14.165), and
setting $\eta = 0$ produces the indicial equation

$$ \det\left[\begin{pmatrix} g_1 & -g_2 \\ -g_2 & g_1 \end{pmatrix} B - \frac{s}{r}\, I\right]_{r=2M} = 0, \tag{14.173} $$

where I is the identity matrix. For vacuum fields we know that
$$ g_1G - g_2F = \tfrac{1}{2}\partial_r(g_1^2 - g_2^2) = M/r^2, \tag{14.174} $$
which is gauge-invariant. It follows that the solutions to the indicial equation
are

$$ s = 0 \quad\text{and}\quad s = -\tfrac{1}{2} + 4iME(g_1f_2 - g_2f_1). \tag{14.175} $$
But, as discussed in section 14.3.6, at the horizon we have

$$ (g_1f_2 - g_2f_1) = \pm 1, \tag{14.176} $$


with the positive sign corresponding to the black hole case. The indices of the 
Dirac equation are therefore gauge-invariant. Similar arguments can be applied 
to scalar and higher-spin fields. 
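
The gauge invariance of the indices can also be confirmed numerically by building the matrix of the indicial equation in two different gauges. The sketch below is our own illustration: it assumes the Newtonian gauge values $g_1 = 1$, $g_2 = -(2M/r)^{1/2}$, $f_1 = 1$, $f_2 = 0$ and the Kerr-Schild values of section 14.3.6, all evaluated at r = 2M, and it uses the matrix B in the form in which the horizon value of the index is s/r times the identity. The helper names are hypothetical.

```python
import cmath


def indicial_roots(M, E, m, kappa, f1, f2, g1, g2, G, F):
    """Return the two indices s at r = 2M, sorted by magnitude."""
    r = 2.0 * M
    # Matrix B for time dependence exp(-iEt) in an arbitrary gauge.
    B = [[kappa / r - G / 2 + 1j * f2 * E, 1j * (m + f1 * E) - F / 2],
         [-1j * (m - f1 * E) - F / 2, -kappa / r - G / 2 + 1j * f2 * E]]
    P = [[g1, -g2], [-g2, g1]]
    N = [[sum(P[i][k] * B[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Eigenvalues of N equal s / r, found from the characteristic quadratic.
    tr = N[0][0] + N[1][1]
    det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    roots = [r * (tr + disc) / 2, r * (tr - disc) / 2]
    return sorted(roots, key=abs)


def newtonian_gauge_at_horizon(M):
    """Assumed Newtonian gauge values at r = 2M (G = dg1/dr, F = dg2/dr)."""
    return dict(f1=1.0, f2=0.0, g1=1.0, g2=-1.0, G=0.0, F=1.0 / (4.0 * M))


def kerr_schild_at_horizon(M):
    """Kerr-Schild values at r = 2M."""
    return dict(f1=1.5, f2=0.5, g1=0.5, g2=-0.5,
                G=1.0 / (4.0 * M), F=1.0 / (4.0 * M))


if __name__ == "__main__":
    M, E, m, kappa = 1.0, 2.0, 1.0, 1
    for gauge in (newtonian_gauge_at_horizon(M), kerr_schild_at_horizon(M)):
        print(indicial_roots(M, E, m, kappa, **gauge))
```

Both gauges return the same pair of indices, one vanishing and one equal to -1/2 + 4iME.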


14.4.3 Quantum absorption


We are now in a position to give a full, quantum-mechanical description of ab- 
sorption by a black hole. At the horizon the solutions of the Dirac equation 
separate into two branches, one regular and one singular. The singular branch is 
unphysical and cannot be excited by finite incoming waves. The regular branch 
is finite at the horizon, with an inward-pointing current. This gives rise to ab- 
sorption. To understand this process in detail we need to study the asymptotic 
form of the regular solutions and determine their split into incoming and out- 
going modes. We can then construct an arbitrary incoming mode (typically a 
plane wave) and study the amount of scattered radiation. Any radiation that is 
not scattered is absorbed. 

In absorption and scattering problems we are interested in states with real 
energy E, E > m. For such states the spatial current J is conserved, and for 
angular eigenstates we obtain the conserved Wronskian W: 


$$ W = g_1(u_1u_2^* + u_1^*u_2) + g_2(u_1u_1^* + u_2u_2^*). \tag{14.177} $$


This measures the total outward flux over a surface of radius r, and we have 
written W in an arbitrary gauge. At the horizon we see that 


$$ W = -g_1|u_1 - u_2|^2, \tag{14.178} $$


and so the flux is inwards for all regular solutions. This is to be expected, as the 
current must point inwards at the horizon. 

For explicit calculations we return to the Newtonian gauge. The radial equa- 
tion (14.164) is straightforward to integrate numerically. We start with a power 
series expansion around the horizon of the regular solution. This allows us to find 
values of $u_1$ and $u_2$ a small distance either side of the horizon. These values are
then used to initiate numerical integration of the equations, both inwards and 
outwards. To visualise the solutions it is convenient to plot the radial density 
function P(r): 


$$ P(r) = |u_1|^2 + |u_2|^2. \tag{14.179} $$

Figure 14.6 The radial density for scattered states. The plots show P(r)
as a function of radius. The horizon lies at r = 2, and the product mM is
set to 0.01 in units of $m_p^2$, where $m_p$ is the Planck mass. The modes are
scaled so that the Wronskian is $-1$, and only the regular solution is plotted.
The top two diagrams are for $\kappa = 1$, with $E = 10mc^2$ (left) and $E = 20mc^2$
(right). The bottom two diagrams are for $\kappa = 2$, with $E = 10mc^2$ (left)
and $E = 20mc^2$ (right).


In physical terms P(r) is r? times the timelike component of the Dirac current, 
as measured by observers in radial free fall from rest at infinity. It is only in 
the Newtonian gauge that this definition gives rise to the simple formula of 
equation (14.179). 

In figure 14.6 we plot P(r) for a range of energies and angular momenta. The
plots are for scattering states, so the wavefunctions are unnormalised. For the
sake of comparison the magnitude of each mode is fixed by setting the Wronskian
to $-1$. The gravitational coupling is controlled by the dimensionless quantity

$$ \frac{GMm}{\hbar c} = \frac{Mm}{m_p^2}, \tag{14.180} $$

where $m_p$ is the Planck mass. In figure 14.6 we have used a dimensionless
coupling of 0.01. The chosen energies of $10mc^2$ and $20mc^2$ imply that the modes
are highly relativistic, and also ensure that the associated wavelengths are larger
than the horizon size. To understand the asymptotic features of the plots we
return to equation (14.161) and solve for the behaviour at large r. We find that
the solutions behave asymptotically as

$$ u_1 = \beta\exp\left(i\left(pr + \frac{M}{p}(m^2 + 2p^2)\ln(pr)\right)\right)e^{2iE(2Mr)^{1/2}} + \alpha\exp\left(-i\left(pr + \frac{M}{p}(m^2 + 2p^2)\ln(pr)\right)\right)e^{2iE(2Mr)^{1/2}}, \tag{14.181} $$

$$ u_2 = \frac{p}{E+m}\left[\beta\exp\left(i\left(pr + \frac{M}{p}(m^2 + 2p^2)\ln(pr)\right)\right) - \alpha\exp\left(-i\left(pr + \frac{M}{p}(m^2 + 2p^2)\ln(pr)\right)\right)\right]e^{2iE(2Mr)^{1/2}}, \tag{14.182} $$

where $p^2 = E^2 - m^2$. The Wronskian is therefore equal to
$$ W = -\frac{2p}{E+m}\left(|\alpha|^2 - |\beta|^2\right), \tag{14.183} $$

and the radial probability P(r) is given asymptotically by

$$ |u_1|^2 + |u_2|^2 = \frac{4m}{E+m}|\alpha||\beta|\cos\left(2pr + \frac{2M(m^2 + 2p^2)}{p}\ln(pr) + \delta_0\right) + \frac{2E}{E+m}\left(|\alpha|^2 + |\beta|^2\right). \tag{14.184} $$


The oscillations predicted by this formula are clearly visible in figure 14.6. The
magnitudes of $\alpha$ and $\beta$ determine the relative amounts of scattered and absorbed
radiation present for a given mode. With the Wronskian held constant, all modes
have a constant flux through the horizon onto the singularity. In the large
r region $|\alpha|$ determines the amount of ingoing radiation, and $|\beta|$ the amount
of outgoing radiation. As $|\alpha|$ increases, a smaller fraction of the radiation is
absorbed and more is scattered. One effect that is clear in figure 14.6 is that as
the angular momentum increases, for fixed energy, $|\alpha|$ also increases. That is,
less radiation is absorbed for fixed energy as the angular momentum increases.
This is precisely the behaviour we expect from classical considerations.

Given that each mode is normalised such that $W = -1$, the total absorption
cross section is given by

$$ \sigma_{abs} = \frac{\pi}{2p(E-m)}\sum_{\kappa\neq 0}\frac{|\kappa|}{|\alpha_\kappa|^2}, \tag{14.185} $$
where $\alpha_\kappa$ is the value of $\alpha$ for each angular eigenmode. The values of $\alpha_\kappa$ are determined
numerically by integrating the radial equations out to a suitable distance
from the horizon and matching to the asymptotic forms of equations (14.181)
and (14.182). Typically, we need to sum over a range of $\kappa$ values before the sum
settles down to its final result. The result of this sum, for a massive fermion,
is plotted in figure 14.7. For energies close to the rest energy the absorption
cross section follows the classical prediction. But at higher energies a series of
oscillations are present as the wavelength becomes comparable with the horizon
size. These oscillations take place around the photon limit of $27\pi$, and are also
present for massless particles. The precise form of these oscillations depends on
the mass of the particle, so represents a quantum-mechanical violation of the
equivalence principle.

Figure 14.7 The quantum absorption cross section. The plot shows the
total absorption cross section as a function of the incident energy. The
dimensionless coupling $Mm/m_p^2$ is 0.1, and the energy is plotted in units
of the rest energy $mc^2$. The horizontal line is the photon limit.
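
The bookkeeping between the asymptotic amplitudes and the absorbed flux can be sketched as follows. This is an illustration only: the amplitudes passed in are made-up numbers rather than output from a radial integration, the Wronskian is taken in the asymptotic form $W = -2p(|\alpha|^2 - |\beta|^2)/(E+m)$, and the photon limit is evaluated from the standard geometric-optics critical impact parameter $3\sqrt{3}M$ for a Schwarzschild black hole. The function names are our own.

```python
import math


def wronskian(alpha, beta, E, m):
    """Asymptotic Wronskian, -2p/(E+m) * (|alpha|^2 - |beta|^2)."""
    p = math.sqrt(E * E - m * m)
    return -2.0 * p / (E + m) * (abs(alpha) ** 2 - abs(beta) ** 2)


def absorbed_fraction(alpha, beta):
    """Fraction of the ingoing flux that is not scattered back out."""
    return 1.0 - abs(beta) ** 2 / abs(alpha) ** 2


def photon_capture_cross_section(M):
    """Geometric-optics capture cross section pi * b_c^2, with b_c = 3 sqrt(3) M."""
    b_c = 3.0 * math.sqrt(3.0) * M
    return math.pi * b_c ** 2


if __name__ == "__main__":
    E, m = 2.0, 1.0
    p = math.sqrt(E * E - m * m)
    alpha = 1.5                                   # hypothetical ingoing amplitude
    beta = math.sqrt(alpha ** 2 - (E + m) / (2.0 * p))  # fixes W = -1
    print(wronskian(alpha, beta, E, m))
    print(absorbed_fraction(alpha, beta), (E + m) / (2.0 * p * alpha ** 2))
    print(photon_capture_cross_section(1.0) / math.pi)
```

With the Wronskian fixed at -1 the absorbed fraction of each mode reduces to $(E+m)/(2p|\alpha|^2)$, so a larger $|\alpha|$ means less absorption, and the photon limit evaluates to $27\pi$ for M = 1.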


14.5 Cosmology 


The radial equations we have developed so far are easily adapted to the case of 
homogeneous, isotropic matter distributions. Such distributions provide a good 
model for the large scale distribution of matter in the observable universe. Be- 
fore studying the field equations for such cosmological matter distributions, we 
must first introduce the cosmological constant. This was originally introduced 
by Einstein to allow the construction of static cosmological solutions, and for 
many years had been thought to be an unnecessary additional feature of general 
relativity. But experimental evidence, both from the cosmic microwave back- 
ground and from distant supernovae, now favours models which do include a 
cosmological constant. There are also hints from quantum gravity that a cosmological
constant should arise as a form of vacuum energy, though this is not well
understood. 

We start with the radial equations, as summarised in table 14.2. Inclusion
of the cosmological constant $\Lambda$ only modifies a handful of these equations. The
mass function M becomes

$$ M = \tfrac{1}{2}r\left(g_2^2 - g_1^2 + 1 - \Lambda r^2/3\right), \tag{14.186} $$
and the derivatives of the $g_1$ and $g_2$ fields become

$$ L_tg_2 = Gg_1 - M/r^2 + r\Lambda/3 - 4\pi rp, \qquad L_rg_1 = Fg_2 + M/r^2 - r\Lambda/3 - 4\pi r\rho. \tag{14.187} $$
The Riemann tensor is altered to
$$ R(B) = 4\pi(\rho + p)\,B\cdot e_t\,e_t - \tfrac{1}{3}(8\pi\rho + \Lambda)B - \left(\frac{M}{2r^3} - \frac{2\pi\rho}{3}\right)(B + 3\sigma_rB\sigma_r) \tag{14.188} $$


and we continue to assume that the matter distribution takes the form of an 
ideal fluid. 

For cosmological models the matter distribution is assumed to be spatially
homogeneous and isotropic, so that $\rho$ and p are functions of time only. The mass
function M is then given by

$$ M(r,t) = \tfrac{4}{3}\pi r^3\rho, \tag{14.189} $$

so the Riemann tensor also depends only on time. The equation for $L_rp$ tells us
that G vanishes, and hence that

$$ f_1 = 1. \tag{14.190} $$

The time coordinate t therefore measures the proper time for observers at rest
with respect to the cosmological background. The derivatives of M and $\rho$ similarly
tell us that

$$ F = \frac{g_2}{r} \tag{14.191} $$
and
$$ \dot{\rho} = -\frac{3g_2}{r}(\rho + p). \tag{14.192} $$
For these to be consistent with the relation $L_rg_2 = Fg_1$ we must have
$$ F = H(t), \qquad g_2(r,t) = rH(t), \tag{14.193} $$

where H(t) is a function of time only. The $L_tg_2$ equation now reduces to a simple
equation for H(t):

$$ \dot{H} + H^2 - \frac{\Lambda}{3} = -\frac{4\pi}{3}(\rho + 3p). \tag{14.194} $$
The h field            $\bar{h}(a) = a + a\cdot e_r\left((g_1 - 1)e^r + H(t)\,r\,e^t\right)$
                       $g_1^2 = 1 - kr^2\exp\left(-2\int^t H(t')\,dt'\right)$

The $\omega$ field     $\omega(a) = H(t)\,a\wedge e_t - (g_1 - 1)/r\; a\wedge(e_re_t)\,e_t$

Riemann tensor         $R(B) = 4\pi(\rho + p)\,B\cdot e_t\,e_t - (8\pi\rho + \Lambda)/3\; B$

The density            $8\pi\rho = 3H(t)^2 - \Lambda + 3k\exp\left(-2\int^t H(t')\,dt'\right)$

Dynamical equations    $\dot{H} + H^2 - \Lambda/3 = -(4\pi/3)(\rho + 3p)$
                       $\dot{\rho} = -3H(t)(\rho + p)$

Table 14.3 Equations governing a homogeneous, isotropic perfect fluid.
The covariant vector $e_t$ defines the rest frame of the universe. This is
determined experimentally from the cosmic microwave background radiation.
No other direction is contained in R(B), and all physical fields are functions
of time only.


Finally, we are left with a pair of equations for $g_1$,

$$ L_tg_1 = 0, \qquad L_rg_1 = (g_1^2 - 1)/r. \tag{14.195} $$
The second equation tells us that $g_1$ is of the form
$$ g_1^2 = 1 + r^2\phi(t). \tag{14.196} $$
The equation for $L_tg_1$ then tells us that $\phi(t)$ satisfies
$$ \dot{\phi} = -2H(t)\phi. \tag{14.197} $$

It follows that $g_1$ is given by

$$ g_1^2 = 1 - kr^2\exp\left(-2\int^t H(t')\,dt'\right), \tag{14.198} $$


where k is an arbitrary constant of integration which turns out to define the 
spatial geometry. The full set of equations describing a homogeneous perfect 
fluid are summarised in table 14.3. 


14.5.1 Comparison with standard approach 


The derivation of the cosmological equations presented here, as a special case of 
a spherical solution, differs from most presentations. To recover a more familiar 
set of equations we first introduce the distance function S defined by 


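$$ H(t) = \frac{\dot{S}}{S}. \tag{14.199} $$
With this substitution we find $g_1$ is now simply
$$ g_1^2 = 1 - kr^2/S^2. \tag{14.200} $$

Similarly, the H and density equations become

$$ \frac{\ddot{S}}{S} - \frac{\Lambda}{3} = -\frac{4\pi}{3}(\rho + 3p), \qquad \frac{\dot{S}^2 + k}{S^2} - \frac{\Lambda}{3} = \frac{8\pi}{3}\rho. \tag{14.201} $$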


These are the Friedmann equations of cosmology. Our derivation has focused 
attention on the Hubble function H(t), rather than the distance scale S(t). This 
is natural, as H(t) is a directly measurable (gauge-invariant) quantity, whereas 
S(t) is only defined up to an arbitrary scaling. 
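
As a consistency check on these equations, the dynamical pair for H and $\rho$ can be integrated numerically while monitoring the density constraint $8\pi\rho = 3H^2 - \Lambda + 3k/S^2$. The sketch below is our own illustration: it assumes an equation of state $p = w\rho$ (not used in the text), a fixed-step fourth-order Runge-Kutta integrator, and the spatially flat, pressureless, $\Lambda = 0$ solution $H = 2/(3t)$ as the test case.

```python
import math


def rhs(state, w, lam):
    """Time derivatives of (S, H, rho), assuming p = w * rho."""
    S, H, rho = state
    p = w * rho
    return (H * S,
            -H * H + lam / 3.0 - (4.0 * math.pi / 3.0) * (rho + 3.0 * p),
            -3.0 * H * (rho + p))


def rk4_step(state, dt, w, lam):
    """Advance the state by one classical Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state, w, lam)
    k2 = rhs(shift(state, k1, dt / 2.0), w, lam)
    k3 = rhs(shift(state, k2, dt / 2.0), w, lam)
    k4 = rhs(shift(state, k3, dt), w, lam)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def evolve(state, t0, t1, steps, w=0.0, lam=0.0):
    """Integrate from t0 to t1 in a fixed number of steps."""
    dt = (t1 - t0) / steps
    for _ in range(steps):
        state = rk4_step(state, dt, w, lam)
    return state


def constraint(state, k, lam):
    """Residual of 8 pi rho = 3 H^2 - Lambda + 3 k / S^2 (zero if satisfied)."""
    S, H, rho = state
    return 3.0 * H * H - lam + 3.0 * k / (S * S) - 8.0 * math.pi * rho


if __name__ == "__main__":
    start = (1.0, 2.0 / 3.0, 1.0 / (6.0 * math.pi))   # flat dust at t = 1
    end = evolve(start, 1.0, 2.0, 2000)
    print(end, constraint(end, 0.0, 0.0))
```

The integration returns $H \approx 1/3$ and $S \approx 2^{2/3}$ at t = 2, as expected for $S \propto t^{2/3}$, and the constraint residual stays at the level of rounding error.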

The Friedmann equations are usually derived by starting with a diagonal line 
element. This is obtained from the radial setup by the displacement defined by 


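$$ f(x) = x\cdot e_t\,e_t + S\,x\wedge e_t\,e_t. \tag{14.202} $$


Under this displacement, h(a) transforms to

$$ \bar{h}'(a) = a\cdot e_t\,e_t + \frac{1}{S}\left((1 - kr^2)^{1/2}\,a\cdot e_r\,e^r + a\wedge\sigma_r\,\sigma_r\right), \tag{14.203} $$

and the line element this defines is

$$ ds^2 = dt^2 - \frac{S^2}{1 - kr^2}\,dr^2 - S^2r^2(d\theta^2 + \sin^2\theta\,d\phi^2). \tag{14.204} $$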


In this gauge we can see clearly that S controls the distance scale, and k controls 
the spatial geometry. We can always choose the scale such that k is either zero
or $\pm 1$. A k of zero corresponds to a spatially flat universe, which is favoured
on theoretical grounds and is consistent with observations. The non-zero values 
correspond to an open universe (k < 0, defining hyperbolic geometry) or a closed 
universe (k > 0, defining spherical geometries). These three spatial geometries 
are the only spatially homogeneous and isotropic models we can consider. These 
geometries are discussed in more detail in chapter 10. Which model is appropri- 
ate for the universe on its largest scales is determined by the present values of the 
density and Hubble function. Most experiments find that the universe is close to 
the critical density (k = 0), but no experiment can ever conclusively prove that 


k is zero. Any slight deviation in the density away from the critical value implies 
that k is non-zero. The fact that the universe is so close to its critical density 
has led theoreticians to propose a range of models which force the universe to 
have k = 0. The most popular of these is provided by inflationary cosmology, in 
which the universe passes through a stage of rapid inflation, so that all spatial 
sections are expanded dramatically and become essentially flat. 
14.5.2 Density perturbations and cluster formation


We will not discuss the detailed solutions of the cosmological equations in this 
book. This is a large subject and is covered in detail in a range of modern 
textbooks. Here we discuss an application where the derivation from the radial 
equations is particularly helpful. The problem of interest is the growth of a 
perturbation in a cosmological background. The perturbation is assumed to be 
spherically-symmetric, and the coordinate system is centred on the perturbation. 
To simplify matters further, we ignore the cosmological constant and set the 
pressure to zero. We are therefore dealing with a simple model of a pressureless 
fluid collapsing under the influence of its own gravity. 

Returning to the radial equations in table 14.2, we see that for zero pressure we
have G = 0 and $f_1 = 1$. The matter therefore follows geodesics, and t measures
the proper time for observers comoving with the matter. The mass satisfies

\[ L_t M = 0, \tag{14.205} \]

which says that the mass M enclosed within radius r is conserved along the
fluid streamlines. The operator $L_t$ is clearly the comoving derivative along the
fluid streamlines. The function $g_1$ is also conserved along a streamline, and the
equations integrate straightforwardly to determine the streamlines (geodesics).
The form of the geodesic depends on the value of $g_1$, and there are three cases
to consider:


1. $g_1^2 < 1$. This case includes closed cosmologies, and the matter streamlines
are defined by

\[ r = \frac{M}{1 - g_1^2}\,\bigl(1 - \cos(\eta)\bigr), \qquad
   t - t_i = \frac{M}{(1 - g_1^2)^{3/2}}\,\bigl(\eta - \sin(\eta) - \eta_i + \sin(\eta_i)\bigr), \tag{14.206} \]

where $\eta$ parameterises the curve, and $\eta_i$ is determined from the initial value of
r at time $t_i$. The velocity $g_2$ is given by

\[ g_2 = \frac{M}{r\,(1 - g_1^2)^{1/2}}\,\sin(\eta) \tag{14.207} \]

and $\eta_i$ is fixed in the range $0 < \eta_i < 2\pi$ by determining whether the initial
velocity is inwards or outwards. Setting $\eta_i = \pi$ corresponds to starting from
rest, and provides a simple model for black hole formation.


2. $g_1^2 = 1$. This case includes flat cosmologies, and the equations integrate
directly to give

\[ t - t_i = \frac{2\,\bigl(r^{3/2} - r_i^{3/2}\bigr)}{3\,(2M)^{1/2}}. \tag{14.208} \]

The velocity is chosen outwards to avoid a singularity forming instantaneously.


3. $g_1^2 > 1$. This case includes open cosmologies. The streamlines are
parameterised by

\[ r = \frac{M}{g_1^2 - 1}\,\bigl(\cosh(\eta) - 1\bigr), \qquad
   t - t_i = \frac{M}{(g_1^2 - 1)^{3/2}}\,\bigl(\sinh(\eta) - \eta - \sinh(\eta_i) + \eta_i\bigr), \tag{14.209} \]

and the velocity is given by

\[ g_2 = \frac{M}{r\,(g_1^2 - 1)^{1/2}}\,\sinh(\eta). \tag{14.210} \]

For this case it is also necessary to start with an initial outward velocity, in order
to avoid streamline crossing.
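
The three streamline families in equations (14.206) to (14.210) can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, time is measured from the point where the parameter vanishes, and the check relies on the relation g_2^2 = 2M/r + g_1^2 - 1, which is assumed here to be the form taken by the radial equations of table 14.2 for zero pressure and zero cosmological constant (it is not restated in this section).

```python
import math

def closed_streamline(M, g1, eta):
    """Case g1^2 < 1: (r, t, g2) from (14.206)-(14.207), with t measured from eta = 0."""
    k = 1.0 - g1 * g1
    r = M / k * (1.0 - math.cos(eta))
    t = M / k ** 1.5 * (eta - math.sin(eta))
    g2 = M * math.sin(eta) / (r * math.sqrt(k))
    return r, t, g2

def open_streamline(M, g1, eta):
    """Case g1^2 > 1: (r, t, g2) from (14.209)-(14.210), with t measured from eta = 0."""
    k = g1 * g1 - 1.0
    r = M / k * (math.cosh(eta) - 1.0)
    t = M / k ** 1.5 * (math.sinh(eta) - eta)
    g2 = M * math.sinh(eta) / (r * math.sqrt(k))
    return r, t, g2

def flat_streamline_time(M, r, r_i):
    """Case g1^2 = 1: elapsed time t - t_i for outward motion from r_i to r, as in (14.208)."""
    return 2.0 * (r ** 1.5 - r_i ** 1.5) / (3.0 * math.sqrt(2.0 * M))

def energy_residual(M, g1, r, g2):
    """Assumed radial relation: g2^2 - (2M/r + g1^2 - 1) should vanish on a streamline."""
    return g2 * g2 - (2.0 * M / r + g1 * g1 - 1.0)

if __name__ == "__main__":
    for label, point in (("closed", closed_streamline(1.0, 0.6, 2.0)),
                         ("open", open_streamline(1.0, 1.5, 0.7))):
        r, t, g2 = point
        print(label, "r =", r, "t =", t, "g2 =", g2)
```

In each case the velocity returned equals dr/dt along the curve, which is what identifies g_2 with the rate of change of r along a streamline.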


By working globally in the Newtonian gauge we keep simple control over the
initial conditions. For these we wish to set up a small perturbation in a finite
region, such that outside the perturbation the system evolves as a homogeneous
cosmology. This will be the case provided the average density in the perturbation
matches the external universe. Suppose that the perturbation initially has width
$r_i$ and the external cosmology has initial values $\rho_i$ and $H_i$ for the density and
Hubble function respectively. We introduce the dimensionless variables

\[ x = \frac{r}{r_i}, \qquad
   v(x) = \frac{g_2(r, t_i) - r H_i}{r_i H_i}, \qquad
   f(x) = \frac{\rho(r, t_i) - \rho_i}{\rho_i}. \tag{14.211} \]

The functions f(x) and v(x) are related by

\[ x^2 f(x) = -\frac{d}{dx}\bigl(x^2 v(x)\bigr), \tag{14.212} \]

with both f(x) and v(x) vanishing at the boundary (x = 1). Equation (14.212)
ensures that the model is correctly compensated, so that the perturbation has
no effect on the external cosmology. (Equation (14.212) also ensures that no
decaying modes are present in the perturbation left over from the linear regime.)
To fix f(x) and v(x) we choose a parameter n, which controls the polynomial
degree of the functions, and also fix the value of the velocity gradient at the
origin. The function v(x) is then a polynomial of degree 2n + 1, formed as
follows. At the centre we set v = 0, and the first derivative is determined by
the velocity gradient. The remaining derivatives up to order n are set to zero.
Similarly, at the boundary v is chosen such that $g_2$ matches the exterior value of
$rH_i$ up to the first n derivatives. The result is a simple function controlling the
perturbation, and for each initial value of r the fluid streamlines can be plotted
easily. An example of these streamlines is shown in figure 14.8.
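
The polynomial construction just described amounts to a small linear problem: the n + 1 conditions at the centre fix the low-order coefficients, and the n + 1 conditions at the boundary fix the rest. The following is a minimal sketch of that reading, using exact rational arithmetic. The function names are hypothetical, and the argument `slope` stands for v'(0); identifying it with (velocity gradient)/H_i - 1 is an assumption about the normalisation of v, not something stated in the text.

```python
from fractions import Fraction

def falling(k, j):
    """k (k-1) ... (k-j+1): the factor produced by differentiating x^k j times."""
    out = 1
    for i in range(j):
        out *= (k - i)
    return out

def perturbation_polynomial(n, slope):
    """Coefficients c[0..2n+1] of v(x) = sum_k c[k] x^k with
    v(0) = 0, v'(0) = slope, v^(j)(0) = 0 for 2 <= j <= n,
    and v^(j)(1) = 0 for 0 <= j <= n."""
    deg = 2 * n + 1
    c = [Fraction(0)] * (deg + 1)
    c[1] = Fraction(str(slope))
    unknown = list(range(n + 1, deg + 1))          # n + 1 free coefficients
    # One row per boundary condition: j-th derivative at x = 1 vanishes.
    rows = []
    for j in range(n + 1):
        row = [Fraction(falling(k, j)) for k in unknown]
        rows.append(row + [-c[1] * falling(1, j)])
    # Gauss-Jordan elimination over the rationals.
    m = len(rows)
    for col in range(m):
        piv = next(i for i in range(col, m) if rows[i][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        p = rows[col][col]
        rows[col] = [e / p for e in rows[col]]
        for i in range(m):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[col])]
    for idx, k in enumerate(unknown):
        c[k] = rows[idx][-1]
    return c

def evaluate(c, x, order=0):
    """Value of the order-th derivative of the polynomial at x."""
    x = Fraction(x)
    return sum(c[k] * falling(k, order) * x ** (k - order)
               for k in range(order, len(c)))

if __name__ == "__main__":
    coeffs = perturbation_polynomial(3, "-0.05")
    print([str(a) for a in coeffs])
```

For n = 1 the result reduces to v(x) = v'(0) x (1 - x)^2, which gives a quick sanity check on the construction.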

Figure 14.8 Matter streamlines for an n = 3 model. The perturbation has
initial width 1, with $H_i = 1$ and $\rho_i = 3/8\pi$. The velocity gradient at the
centre of the perturbation is 0.95. The central region is therefore moving
inwards relative to the Hubble flow, so recollapses to a singularity after a
finite time. All units are arbitrary.

If the system is allowed to evolve for a suitable amount of time, it provides
a good model of a cluster of galaxies sitting inside a cosmological background.
One can then study photon paths in this model, to look for lensing effects, or
temperature perturbations in the cosmic microwave background. One weakness
with these models is that no pressure is included, so the cluster has no means of
supporting itself. This implies that a singularity forms after a finite time
(determined by the central density and velocity gradient). The model then describes
a black hole, sitting in an expanding universe.


14.5.3 The Dirac equation in a cosmological background


A good illustration of the full gravitational equations, with torsion included, is 
provided by the case of a Dirac field coupled self-consistently to gravity. The 
equations governing this system are 


H(a) = An(wly3W) “A, 
Gla) — Aa = 8r(a-Dyly3)1, (14.213) 
DibIy3 = mw. 
This system of equations is highly non-linear and extremely difficult to analyse
in all but the simplest of situations. Here we are interested in cosmological
solutions, for which all fields are functions of time only. We also restrict our
discussion to the spatially flat case (k = 0), so that we can write

\[ \bar{h}(a) = a + r H(t)\, a\cdot e_r\, e_t. \tag{14.214} \]

The ω function is given by

\[ \omega(a) = H(t)\, a\wedge e_t + \tfrac{1}{2}\kappa\, a\cdot S, \tag{14.215} \]

where $\kappa = 8\pi$ and S denotes the spin trivector:

\[ S = \tfrac{1}{2}\,\psi I\gamma_3\tilde{\psi}. \tag{14.216} \]
After a little work, the Einstein tensor evaluates to 
Gla) = 2Hane, e1 + 3H?a — ika (D-S)+ $ra SS- 3r?S?a, (14.217) 
and the matter energy-momentum tensor is 
T (a) = (a-epbly3y) + Hane, S + ikas S). (14.218) 
Finally, the Dirac equation is now 
(e19: + å Hes + 2KS)pI73 = my, (14.219) 


which has the unusual feature of being nonlinear, due to the presence of the spin 
term. 

We will construct the simplest solution to this system by setting the spinor ψ
equal to a magnitude and phase only:

\[ \psi = \rho(t)^{1/2}\, e^{-I\sigma_3 \chi(t)}. \tag{14.220} \]
The Dirac equation therefore reduces to the pair of equations

\[ \dot{\rho} = -3\rho H, \qquad \dot{\chi} = 3\pi\rho + m. \tag{14.221} \]

The Einstein equation yields the final pair of equations

\[ 3H^2 - 12\pi^2\rho^2 - 8\pi m\rho - \Lambda = 0, \qquad
   2\dot{H} + 3H^2 + 12\pi^2\rho^2 - \Lambda = 0. \tag{14.222} \]
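The second of these follows from the first and the equation for ρ. These equations
are solved by

\[ \rho = \frac{\beta^2}{6\pi \sinh(\beta t)\,\bigl(m\sinh(\beta t) + \beta\cosh(\beta t)\bigr)}, \tag{14.223} \]

where

\[ \beta = \frac{(3\Lambda)^{1/2}}{2}. \tag{14.224} \]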
The initial singularity is chosen to correspond to t = 0. The Hubble function is
similarly given by

\[ H(t) = \frac{\beta^2 + 2\beta^2\sinh^2(\beta t) + 2m\beta\sinh(\beta t)\cosh(\beta t)}
               {3\sinh(\beta t)\,\bigl(m\sinh(\beta t) + \beta\cosh(\beta t)\bigr)}. \tag{14.225} \]

The limit $\Lambda \to 0$ is easily taken and gives rather simpler behaviour in the absence
of a cosmological constant:

\[ \rho(t) = \frac{1}{6\pi t(1 + mt)}, \qquad H(t) = \frac{1 + 2mt}{3t(1 + mt)}. \tag{14.226} \]

Antiparticle solutions can also be found, though these can have unusual properties.
At large times the Hubble function tends to a constant value of $(\Lambda/3)^{1/2}$.
This behaviour is typical of Λ cosmologies and leads to the surprising prediction
that the universe will keep accelerating. The presence of a non-zero spin vector
implies that these models break isotropy, but this fact is hidden from the line
element, which remains isotropic. The spin direction is only seen by particles
with non-zero spin, which interact directly with the torsion tensor.
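
The closed-form density (14.223) and Hubble function (14.225) can be tested directly against the first equation of (14.222) and the density equation in (14.221). The sketch below does this numerically; the function names and the sample values of m and Λ are illustrative choices, and the time derivative is approximated by a central difference rather than computed exactly.

```python
import math

def beta(lam):
    """Equation (14.224): beta = sqrt(3 Lambda) / 2."""
    return math.sqrt(3.0 * lam) / 2.0

def density(t, m, lam):
    """Equation (14.223)."""
    b = beta(lam)
    s, c = math.sinh(b * t), math.cosh(b * t)
    return b * b / (6.0 * math.pi * s * (m * s + b * c))

def hubble(t, m, lam):
    """Equation (14.225)."""
    b = beta(lam)
    s, c = math.sinh(b * t), math.cosh(b * t)
    return (b * b + 2.0 * b * b * s * s + 2.0 * m * b * s * c) / (3.0 * s * (m * s + b * c))

def einstein_residual(t, m, lam):
    """Left-hand side of the first equation in (14.222); should vanish."""
    rho, H = density(t, m, lam), hubble(t, m, lam)
    return 3.0 * H * H - 12.0 * math.pi ** 2 * rho * rho - 8.0 * math.pi * m * rho - lam

def continuity_residual(t, m, lam, step=1e-5):
    """d(rho)/dt + 3 rho H, with the derivative taken by central difference."""
    drho = (density(t + step, m, lam) - density(t - step, m, lam)) / (2.0 * step)
    return drho + 3.0 * density(t, m, lam) * hubble(t, m, lam)

if __name__ == "__main__":
    m, lam = 1.0, 0.3
    for t in (0.5, 1.0, 5.0):
        print(t, einstein_residual(t, m, lam), continuity_residual(t, m, lam))
    print("late-time H:", hubble(50.0, m, lam), "vs", math.sqrt(lam / 3.0))
```

Taking Λ very small in the same functions reproduces the simpler expressions of (14.226), which is a convenient cross-check on the limit.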


14.6 Cylindrical systems 


We now turn our attention to a different class of exact solutions — those ex- 
hibiting cylindrical symmetry. Such solutions can provide models for stringlike 
configurations, and some of the solutions are also appropriate for gravity in (2+1) 
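dimensions. We first introduce cylindrical polar coordinates $(t, \rho, \phi, z)$, where

\[ \rho = \bigl((x^1)^2 + (x^2)^2\bigr)^{1/2}, \qquad \tan(\phi) = \frac{x^2}{x^1}, \tag{14.227} \]

and $x^\mu = \gamma^\mu\cdot x$. We use the symbol ρ for the cylindrical distance to avoid
confusion with the radial coordinate r used throughout this chapter. When we
come to describe the matter, the energy density is denoted ε in this section. The
coordinate frame defined by cylindrical polar coordinates is

\[ \begin{aligned}
e_t &= \gamma_0, & e_\phi &= \rho\,\bigl(-\sin(\phi)\,\gamma_1 + \cos(\phi)\,\gamma_2\bigr), \\
e_\rho &= \cos(\phi)\,\gamma_1 + \sin(\phi)\,\gamma_2, & e_z &= \gamma_3,
\end{aligned} \tag{14.228} \]

and we continue to write $\hat{\phi}$ for the unit vector $e_\phi/\rho$. As a bivector basis we use
the set $\{\sigma_\rho, \sigma_\phi, \sigma_3\}$, where

\[ \sigma_\rho = e_\rho e_t, \qquad \sigma_\phi = \hat{\phi} e_t, \qquad \sigma_3 = e_z e_t. \tag{14.229} \]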


We are interested in stationary fields that exhibit cylindrical symmetry. For
these we can write a general h function as

\[ \begin{aligned}
\bar{h}(e^t) &= f_1 e^t + \rho f_2 e^\phi, & \bar{h}(e^\rho) &= g_1 e^\rho, \\
\bar{h}(e^\phi) &= \rho h_1 e^\phi + h_2 e^t, & \bar{h}(e^z) &= e^z,
\end{aligned} \tag{14.230} \]

where all of the arbitrary functions depend on ρ only. A suitable ω field consistent
with this h field is given by

\[ \begin{aligned}
\omega_t &= \omega(e_t) = -T\sigma_\rho + (K + h_2)\, I\sigma_3, \\
\omega_\rho &= \omega(e_\rho) = \bar{K}\sigma_\phi, \\
\omega_{\hat\phi} &= \omega(\hat{\phi}) = K\sigma_\rho + (h_1 - G)\, I\sigma_3, \\
\omega_z &= \omega(e_z) = 0.
\end{aligned} \tag{14.231} \]

Again, the new scalar functions appearing here $(T, K, \bar{K}, G)$ are functions of
ρ alone. Since all expressions involving $L_z$ must vanish, there are only three
commutation relations to construct. These are

\[ \begin{aligned}
{[L_\rho, L_t]} &= T L_t + (K + \bar{K}) L_{\hat\phi}, \\
{[L_\rho, L_{\hat\phi}]} &= -(K - \bar{K}) L_t - G L_{\hat\phi}, \\
{[L_t, L_{\hat\phi}]} &= 0.
\end{aligned} \tag{14.232} \]

Since neither $L_t$ nor $L_{\hat\phi}$ contains derivatives with respect to ρ, the bracket
relations immediately yield

\[ \begin{aligned}
L_\rho f_1 &= T f_1 + (K + \bar{K}) f_2, \\
L_\rho f_2 &= -G f_2 - (K - \bar{K}) f_1, \\
L_\rho h_1 &= -G h_1 - (K - \bar{K}) h_2, \\
L_\rho h_2 &= T h_2 + (K + \bar{K}) h_1.
\end{aligned} \tag{14.233} \]

The cylindrical derivative $L_\rho$ is given by $L_\rho = g_1(\rho)\,\partial_\rho$. We can always make the
position gauge choice $g_1 = 1$, though this is not always the simplest gauge to
work with.

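The Riemann tensor takes the general form

\[ \begin{aligned}
\mathcal{R}(\sigma_\rho) &= \alpha_1\sigma_\rho + \beta I\sigma_3, \\
\mathcal{R}(I\sigma_3) &= \alpha_2 I\sigma_3 - \beta\sigma_\rho, \\
\mathcal{R}(\sigma_\phi) &= \alpha_3\sigma_\phi,
\end{aligned} \tag{14.234} \]

where the scalar functions are defined by

\[ \begin{aligned}
\alpha_1 &= -L_\rho T + T^2 - K(K + 2\bar{K}), \\
\alpha_2 &= L_\rho G + G^2 - K(K - 2\bar{K}), \\
\alpha_3 &= K^2 - GT, \\
\beta &= L_\rho K + G(K + \bar{K}) - T(K - \bar{K}).
\end{aligned} \tag{14.235} \]

The same functions appear in the Einstein tensor,

\[ \begin{aligned}
\mathcal{G}(e_t) &= -\alpha_2 e_t - \beta\hat{\phi}, \\
\mathcal{G}(e_\rho) &= -\alpha_3 e_\rho, \\
\mathcal{G}(\hat{\phi}) &= -\alpha_1\hat{\phi} + \beta e_t, \\
\mathcal{G}(e_z) &= -(\alpha_1 + \alpha_2 + \alpha_3)\, e_z.
\end{aligned} \tag{14.236} \]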


It is a feature of gravity in (2 + 1) dimensions that all of the information in the 
Riemann tensor is also contained in the Einstein tensor. That is, there is no Weyl 
tensor in three dimensions. It also turns out that no additional new information 
is obtained from the Bianchi identities, which are satisfied automatically from 
the equations we have already constructed. 

The h function of equation (14.230) contains a single rotational gauge freedom,
which is the freedom to boost in the $\sigma_\phi$ direction. If we make the physical
assumption that the matter energy-momentum tensor has a future-pointing timelike
eigenvector, the gauge freedom can be used to set this eigenvector to the $e_t$
direction. Once this is done all the rotational gauge freedom in the problem has
been removed, and we are left with a complete set of field equations. These are

\[ \begin{aligned}
-L_\rho G - G^2 + K(K - 2\bar{K}) &= 8\pi\varepsilon, \\
K^2 - GT &= 8\pi P_\rho, \\
-L_\rho T + T^2 - K(K + 2\bar{K}) &= 8\pi P_\phi, \\
L_\rho K + G(K + \bar{K}) - T(K - \bar{K}) &= 0,
\end{aligned} \tag{14.237} \]

where ε is the matter density, and $P_\rho$ and $P_\phi$ are the radial and azimuthal
pressures respectively. The coefficient of $\mathcal{G}(e_z)$ is determined algebraically by the
other three coefficients, and the same must therefore be true of the matter
energy-momentum tensor. It follows that the z-component of the Einstein equations
contains no new information. Of course, if we were working in a genuine (2 + 1)
system, the $e_z$ equation would not be present.


14.6.1 Vacuum solutions 


In the vacuum region all of the scalars $\{\alpha_1, \alpha_2, \alpha_3, \beta\}$ are zero, so we are still
free to perform a ρ-dependent boost in the $\sigma_\phi$ direction. This freedom can be
employed to set $\bar{K}$ to zero. It is also useful in this region to work in a gauge
where $g_1 = 1$. In this case the vacuum region is described by the simple pair of
equations

\[ \begin{aligned}
\partial_\rho G + G^2 - GT &= 0, \\
\partial_\rho T - T^2 + GT &= 0,
\end{aligned} \tag{14.238} \]

with K determined by $K^2 = GT$. On subtracting these equations and integrating
we see that

\[ G - T = \frac{1}{\rho + \rho_0}, \tag{14.239} \]

where $\rho_0$ is an arbitrary constant of integration. Similarly, adding the equations
and integrating yields

\[ G + T = \frac{c}{\rho + \rho_0}, \tag{14.240} \]

where c is a second constant of integration.
The restriction that $GT = K^2 \geq 0$ means that $c^2 \geq 1$, and we can set

\[ c = \pm\cosh(2\alpha). \tag{14.241} \]

There are two distinct vacuum configurations, depending on which sign is chosen
for c. In either case, the constant α can be gauged to zero with a further constant
boost in the $\sigma_\phi$ direction (which does not reintroduce a $\bar{K}$ term). The two
vacuum sectors are therefore characterised by the solutions

\[ \begin{aligned}
\text{type I:} \quad & G = \frac{1}{\rho + \rho_0}, \quad T = K = \bar{K} = 0, \\
\text{type II:} \quad & T = -\frac{1}{\rho + \rho_0}, \quad G = K = \bar{K} = 0.
\end{aligned} \tag{14.242} \]

All other vacuum solutions can be reached from this pair by ρ-dependent boosts
in the $\sigma_\phi$ direction. No globally-defined gauge transformation exists between
these solution classes. For both solutions the Riemann tensor vanishes, since
there is no Weyl tensor for three-dimensional systems. It is therefore possible
locally to gauge transform all of these fields to zero, but this is not possible
globally. In this sense the solutions represent two distinct topological structures.
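
Equations (14.239) and (14.240) give G and T explicitly once c and ρ_0 are chosen, and it is easy to confirm that the result satisfies the vacuum pair (14.238). The sketch below does so with a central-difference derivative; the function names and sample constants are illustrative only.

```python
import math

def vacuum_fields(rho, c, rho0=1.0):
    """G and T from G - T = 1/(rho + rho0) and G + T = c/(rho + rho0)."""
    u = rho + rho0
    return (c + 1.0) / (2.0 * u), (c - 1.0) / (2.0 * u)

def vacuum_residuals(rho, c, rho0=1.0, step=1e-5):
    """Residuals of the two equations in (14.238), derivatives by central difference."""
    G, T = vacuum_fields(rho, c, rho0)
    Gp, Tp = vacuum_fields(rho + step, c, rho0)
    Gm, Tm = vacuum_fields(rho - step, c, rho0)
    dG = (Gp - Gm) / (2.0 * step)
    dT = (Tp - Tm) / (2.0 * step)
    return dG + G * G - G * T, dT - T * T + G * T

if __name__ == "__main__":
    for c in (1.0, -1.0, math.cosh(0.6)):
        G, T = vacuum_fields(2.0, c)
        # K^2 = GT must be non-negative, which is why c^2 >= 1 is required.
        print("c =", c, "G =", G, "T =", T, "GT =", G * T, vacuum_residuals(2.0, c))
```

Setting c = 1 reproduces the type I fields and c = -1 the type II fields of (14.242).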


14.6.2 Physical properties of matter solutions 


The key physical properties associated with matter solutions are the acceleration,
vorticity, shear and angular momentum of the string. Given that we have chosen
a gauge where the timelike eigenvector of the energy-momentum tensor is $e_t$, the
acceleration vector w is defined by

\[ w = e_t\cdot\mathcal{D}\, e_t = -T e_\rho. \tag{14.243} \]

This measures the extent to which particles comoving with the matter (with
velocity $e_t$) depart from geodesic motion. The vorticity bivector ϖ is defined by

\[ \varpi = \mathcal{D}\wedge e_t + w\wedge e_t = -(K - \bar{K})\, I\sigma_3. \tag{14.244} \]

The definition ensures that ϖ satisfies $e_t\cdot\varpi = 0$. To define the shear tensor we
require the linear function H that projects vectors into the 3-space orthogonal
to $e_t$,

\[ H(a) = a - a\cdot e_t\, e_t. \tag{14.245} \]

In terms of this function the shear tensor σ(a) is defined by

\[ \begin{aligned}
\sigma(a) &= \tfrac{1}{2}\bigl(H(a)\cdot\mathcal{D}\, e_t + H(\partial_b)\,(b\cdot\mathcal{D}\, e_t)\cdot a\bigr) - \tfrac{1}{3} H(a)\,\mathcal{D}\cdot e_t \\
          &= -\tfrac{1}{2}(K + \bar{K})\,\bigl(a\cdot e_\rho\,\hat{\phi} + a\cdot\hat{\phi}\, e_\rho\bigr).
\end{aligned} \tag{14.246} \]

This is a symmetric, traceless linear function. We see that acceleration is
controlled by T, the vorticity by $(K - \bar{K})$ and the shear by $(K + \bar{K})$. In the matter
region all of these scalar quantities are physically measurable functions. The
same is true of the fourth function, G, which can be determined from the radial
pressure.

The remaining physical property of relevance is the angular momentum
contained in the fields. The vector $g_\phi$ is a Killing vector for cylindrical solutions, so
the vector $\mathcal{T}(g_\phi)$ is covariantly conserved. It follows that

\[ \nabla\cdot\bigl(\underline{h}(\mathcal{T}(g_\phi))\,\det(h)^{-1}\bigr) = 0. \tag{14.247} \]

The total conserved angular momentum per unit length in the $e_t$ frame is
therefore given by the expression

\[ J = \int_0^{\rho_s} d^2x\; g^t\cdot\mathcal{T}(g_\phi)\,\det(h)^{-1}, \tag{14.248} \]

where $\rho_s$ is the string radius. In the $g_1 = 1$ gauge this expression evaluates to
give

\[ J = -2\pi\int_0^{\rho_s} d\rho\,(\varepsilon + P_\phi)\, f_1 f_2\,(f_1 h_1 - f_2 h_2)^{-2}, \tag{14.249} \]

which shows that a non-zero $f_2$ is required for angular momentum to be present.


14.6.3 Cosmic strings 


Cosmic strings are an example of topological defects that can occur as a remnant
of symmetry breaking processes in the early universe. They have zero radial and
azimuthal pressures. It follows that there is a negative pressure along the length
of the string: they are under tension. The energy-momentum tensor is

\[ \mathcal{T}(a) = \tfrac{1}{2}\varepsilon\,(a - I\sigma_3\, a\, I\sigma_3). \tag{14.250} \]

From the Einstein equations we see that $\alpha_1 = \alpha_3 = \beta = 0$, and the Riemann
tensor therefore has the compact form

\[ \mathcal{R}(B) = 8\pi\varepsilon\,(B\cdot I\sigma_3)\, I\sigma_3. \tag{14.251} \]

Tidal forces are only exerted in the $I\sigma_3$ plane and are controlled by the density.
The Einstein equations tell us that $T = K = \bar{K} = 0$, so all that remains is the
single equation

\[ L_\rho G + G^2 = -8\pi\varepsilon. \tag{14.252} \]

The full solution is then recovered by integrating the bracket equations (14.233).
These imply that both $f_1$ and $h_2$ are constant. A global rotation can therefore
be performed to transform to a gauge where $f_1 = 1$ and $h_2 = 0$. The remaining
equations are

\[ L_\rho h_1 = -G h_1, \qquad L_\rho f_2 = -G f_2. \tag{14.253} \]

It follows that $f_2 = \lambda h_1$, where λ is an arbitrary constant. But $\rho h_1$ must tend
to 1 as $\rho \to 0$ so that $\bar{h}(a)$ is well defined on the axis. It follows that $h_1$, and
hence $f_2$, must diverge as $\rho^{-1}$. For $f_2$ this would imply that $\bar{h}(e^t)$ is singular on
the axis, which is not permitted. It follows that the constant λ must be zero, so
the string has no angular momentum. This agrees with the fact that the shear
and vorticity are both zero. Pressure is necessary for strings to have any angular
momentum.

We have now restricted $\bar{h}(a)$ to the simple form

\[ \bar{h}(a) = a + (g_1 - 1)\, a\cdot e_\rho\, e^\rho + (\rho h_1 - 1)\, a\cdot e_\phi\, e^\phi, \tag{14.254} \]

and the remaining equations are

\[ L_\rho h_1 = -G h_1, \qquad L_\rho G = -8\pi\varepsilon - G^2, \tag{14.255} \]

with $L_\rho = g_1\partial_\rho$. To complete the solution we must make a gauge choice for $g_1$.
An obvious choice is to set $g_1 = 1$, so that ρ measures the proper radial distance
from the string. A slightly simpler alternative is to choose a gauge such that
$\bar{h}(e^\phi) = e^\phi$. This requires that

\[ h_1 = 1/\rho \tag{14.256} \]

and it follows that

\[ G = g_1/\rho. \tag{14.257} \]

The equations now integrate to give

\[ g_1^2 = 1 - \int_0^\rho 16\pi s\,\varepsilon(s)\, ds, \tag{14.258} \]

where the constant of integration is chosen so that $\bar{h}(a)$ is well defined on the
axis. On defining

\[ M(\rho) = \int_0^\rho 2\pi s\,\varepsilon(s)\, ds, \tag{14.259} \]

the solution can be summarised neatly by

\[ \bar{h}(a) = a + \bigl((1 - 8M(\rho))^{1/2} - 1\bigr)\, a\cdot e_\rho\, e^\rho. \tag{14.260} \]

The choice of density function is arbitrary, provided $8M(\rho) < 1$. In the vacuum
region outside the string we have $T = K = \bar{K} = 0$, so the vacuum region is
described by a solution in the gauge class of type I. This can be described in
terms of a flat spacetime, with a wedge of spacetime removed and the edges
identified. This topological picture of a string defect can be used to provide a
qualitative understanding of many of the string’s properties.
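
For a given density profile, equations (14.258) to (14.260) reduce the cosmic string solution to a single quadrature. The sketch below evaluates M(ρ) by Simpson's rule and returns the corresponding g_1. The function names and the sample uniform density are illustrative. The deficit-angle helper is an added assumption: it uses 2π(1 - g_1), which is how the removed wedge would be read off when g_1 is constant outside the string, and is not a formula quoted from the text.

```python
import math

def mass_per_unit_length(density, rho, steps=200):
    """Simpson estimate of M(rho) = int_0^rho 2 pi s eps(s) ds, equation (14.259)."""
    if steps % 2:
        steps += 1
    h = rho / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * 2.0 * math.pi * s * density(s)
    return total * h / 3.0

def g1(density, rho, steps=200):
    """g_1 = (1 - 8 M(rho))^(1/2), from (14.258) and (14.259); requires 8M < 1."""
    eight_m = 8.0 * mass_per_unit_length(density, rho, steps)
    if eight_m >= 1.0:
        raise ValueError("density profile violates 8 M(rho) < 1")
    return math.sqrt(1.0 - eight_m)

def deficit_angle(density, rho_s, steps=200):
    """Assumed wedge angle 2 pi (1 - g_1) seen outside a string of radius rho_s."""
    return 2.0 * math.pi * (1.0 - g1(density, rho_s, steps))

if __name__ == "__main__":
    uniform = lambda s: 0.01          # illustrative constant density inside the string
    print("M(1) =", mass_per_unit_length(uniform, 1.0))
    print("g1(1) =", g1(uniform, 1.0))
    print("deficit angle =", deficit_angle(uniform, 1.0))
```

For a uniform density the integrand is linear in s, so Simpson's rule is exact and M(ρ) equals π ε ρ^2 up to rounding.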


14.6.4 Rigidly rotating strings 


The simplest models that include pressure are those for a two-dimensional ideal
fluid, with $P_\rho = P_\phi = P$. The two natural physical models to consider are those
where the fluid is vorticity-free ($K = \bar{K}$) or shear-free ($K = -\bar{K}$). The latter
case corresponds to a rigidly rotating string, and is the situation we analyse here.
The equations governing this setup are (in the $g_1 = 1$ radial gauge)

\[ \begin{aligned}
\partial_\rho K - 2KT &= 0, \\
\partial_\rho G + G^2 &= -8\pi\varepsilon + 3K^2, \\
\partial_\rho T - T^2 &= -8\pi P + K^2, \\
K^2 - GT &= 8\pi P.
\end{aligned} \tag{14.261} \]

These can be solved once the density distribution has been specified. A choice
of density that produces a straightforward solution is

\[ 8\pi\varepsilon = 3K^2 + \lambda^2, \tag{14.262} \]

where λ is an arbitrary positive constant. This ansatz ensures that the density
is always positive. The equations for G and T can be solved immediately to give

\[ G = \frac{\lambda\cos(\lambda\rho)}{\sin(\lambda\rho)}, \qquad T = \frac{\lambda\sin(\lambda\rho)}{\cos(\lambda\rho) + A}, \tag{14.263} \]

where A is a constant satisfying A < −1.
We next solve for K to obtain

\[ K = \frac{B}{(A + \cos(\lambda\rho))^2}, \tag{14.264} \]

where B is a further constant. The density and pressure can now be recovered
from equations (14.261). The boundary of the string occurs where the pressure
vanishes, and this must be reached before $\rho = \pi/\lambda$. Finally, we return to
equations (14.233) to find a suitable form for the h function. First we see that $f_1/h_2$
is a constant, so that a gauge transformation can be performed to set $h_2 = 0$.
The remaining functions are easily found by integration:

\[ \begin{aligned}
f_1 &= \frac{1 + A}{\cos(\lambda\rho) + A}, \\
h_1 &= \frac{\lambda}{\sin(\lambda\rho)}, \\
f_2 &= \frac{-B\,(f_1^2 - 1)}{\lambda (A + 1)\sin(\lambda\rho)}.
\end{aligned} \tag{14.265} \]
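For $f_1$ the arbitrary time-scale factor has been used to set $f_1 = 1$ on the axis. It
is simple to verify that this solution is well defined on the axis of the string. For
completeness, the corresponding line element is

\[ \begin{aligned}
ds^2 = {} & \frac{(\cos(\lambda\rho) + A)^2}{(1 + A)^2}\, dt^2
        + \frac{2B}{\lambda^2 (1 + A)^3}\,\bigl(1 - \cos(\lambda\rho)\bigr)\bigl(2A + 1 + \cos(\lambda\rho)\bigr)\, dt\, d\phi \\
      & - \frac{\sin^2(\lambda\rho)}{\lambda^2}\left(1 - \frac{B^2 \bigl(1 - \cos(\lambda\rho)\bigr)^2 \bigl(2A + 1 + \cos(\lambda\rho)\bigr)^2}
        {\lambda^2 \sin^2(\lambda\rho)\,(1 + A)^4 (A + \cos(\lambda\rho))^2}\right) d\phi^2 - d\rho^2 - dz^2.
\end{aligned} \tag{14.266} \]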
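
The interior solution (14.263) and (14.264) can be checked against the field equations (14.261) with the density ansatz (14.262), and the string boundary can be located as the radius where the pressure vanishes. The sketch below is illustrative: the function names and the sample constants λ = 1, A = -2, B = 1 are arbitrary choices, derivatives are taken by central difference, and the boundary is found by simple bisection.

```python
import math

def fields(rho, lam, A, B):
    """G, T and K from equations (14.263) and (14.264)."""
    s, c = math.sin(lam * rho), math.cos(lam * rho)
    return lam * c / s, lam * s / (c + A), B / (A + c) ** 2

def residuals(rho, lam, A, B, step=1e-5):
    """Residuals of dK - 2KT = 0, dG + G^2 + lam^2 = 0 and dT - T^2 - GT = 0.
    The last two follow from (14.261) once 8 pi eps = 3K^2 + lam^2 and
    8 pi P = K^2 - GT are substituted."""
    G, T, K = fields(rho, lam, A, B)
    Gp, Tp, Kp = fields(rho + step, lam, A, B)
    Gm, Tm, Km = fields(rho - step, lam, A, B)
    dG, dT, dK = ((Gp - Gm) / (2 * step), (Tp - Tm) / (2 * step), (Kp - Km) / (2 * step))
    return dK - 2 * K * T, dG + G * G + lam * lam, dT - T * T - G * T

def pressure_8pi(rho, lam, A, B):
    """8 pi P = K^2 - GT, with GT written out to stay finite near the axis."""
    c = math.cos(lam * rho)
    return B * B / (A + c) ** 4 - lam * lam * c / (c + A)

def string_boundary(lam, A, B, iterations=200):
    """Bisection for the first zero of the pressure in (0, pi/lam)."""
    lo, hi = 1e-6, math.pi / lam - 1e-9
    if pressure_8pi(lo, lam, A, B) <= 0 or pressure_8pi(hi, lam, A, B) >= 0:
        raise ValueError("pressure does not change sign on (0, pi/lam)")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if pressure_8pi(mid, lam, A, B) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    lam, A, B = 1.0, -2.0, 1.0
    print("residuals at rho = 0.7:", residuals(0.7, lam, A, B))
    print("string boundary at rho =", string_boundary(lam, A, B))
```

With these sample constants the pressure is positive on the axis and changes sign before ρ = π/λ, in line with the statement that the boundary must be reached before that radius.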


The exterior vacuum fields can be found simply by returning to the vacuum
equations, and solving these in the case where $K + \bar{K} = 0$. The general form of
vacuum fields outside a rigidly rotating string is then given by

\[ \begin{aligned}
G &= \frac{-\alpha^2}{(\rho + \rho_0)\,\bigl((\rho + \rho_0)^2 - \alpha^2\bigr)}, \\
T &= \frac{-(\rho + \rho_0)}{(\rho + \rho_0)^2 - \alpha^2}, \\
K &= \frac{\alpha}{(\rho + \rho_0)^2 - \alpha^2},
\end{aligned} \tag{14.267} \]

where $\rho_0$ and α are constants to be determined by the fields at the boundary.
This solution falls into the second class of vacuum solutions, as defined by
equation (14.242). The h function is determined by


fr = —(1 + A)(a/B)'/?((p + po)? — ey 
= appre (io oo)? 22)" 

Sense (p + po) (14.268) 

h= horp) (fi 1). 


1 


? 


These fields have an unusual property. At large distances, $f_1$ falls off as $\rho^{-1}$
whereas $f_2$ tends to a constant value. Beyond the point where the magnitude
of $f_2$ overtakes that of $f_1$, a closed circular path orbiting the string becomes
timelike. This solution admits closed timelike curves, even out at infinity. Such
solutions are often thought of as unphysical, due to the bizarre acausal effects
they would allow. But there is nothing outrageous in the matter distribution
used to generate the solution, and it is difficult to pin down a precise statement
of what constitutes a ‘physically acceptable’ matter distribution.


14.7 Axially-symmetric systems 


As a further application of the gauge theory treatment of gravity, we now turn 
to the equations governing a stationary axisymmetric system. Such fields are 
produced by rotating stars, galaxies and black holes, and as such are of consid- 
erable importance in astrophysics. The prototype axisymmetric configuration is 
described by the Kerr solution, which uniquely describes the fields produced by 
an uncharged rotating black hole. The more complicated problem of finding the 
fields outside a rotating massive object such as a star or planet has yet to be 
fully solved. Here we discuss two forms of the Kerr solution. The first continues 
the solution strategy adopted in the cylindrical setup, and can be generalised to 
include matter fields. The second form generalises the Newtonian gauge for the 
Schwarzschild solution, and has a number of significant features. 


14.7.1 Intrinsic form of the axisymmetric equations 


We employ a standard spherical-polar coordinate system to describe axisymmet- 
ric fields, and the notation is precisely as defined at the start of section 14.2. A 
suitable form of the h function consistent with axial symmetry is 


h(e*) = fie’ pz fae®, 
h(e”) = gie" + g3e’, 
nt J oe. (14.269) 
h(e’) = ije” + ize”, 
h(e®) = hie? + hot, 
where all of the variables $\{f_1, \ldots, i_3\}$ are scalar functions of r and θ. The labelling
convention for the $\{f_i, \ldots, i_i\}$ is chosen to allow for a more general
parameterisation appropriate for time-dependent systems. We have ignored the possibility
of any coupling between the $e^t$ and $e^r$, so strictly speaking are looking for the
fields outside an extended source with no horizon present. On solving the
vacuum field equations we will construct a form of the Kerr solution, which will
turn out to be ill defined at the horizon. As with the Schwarzschild solution, the
singular nature of the fields is a consequence of a bad gauge choice, rather than
an intrinsic property of the fields. In section 14.7.3 we give a form of the Kerr
solution which avoids this problem.

A suitably general form of ω function consistent with the h field of
equation (14.269) is given by
) = —(T + TJ)epe, — (S + 1K) be; + helo, 
w(e,) = (S + IK’ )e,6 — ize,6, 
w(6) = (G! + IJ')e,6 — (i1 /r)erð, 
w(o) = (H + IK)ĝb + (G + IJ)erh + hı/(r sin(0)) Io3. 


The variables written in capitals are also functions of r and θ, except for the


w(er 


(14.270) 


pseudoscalar I. The reason for the labelling scheme will become clearer when
the final set of equations is derived. There are 40 independent scalar variables
in gravity, so it is difficult to construct a labelling scheme that does not
conflict with existing conventions somewhere. A significant feature of our scheme
is that a complex structure naturally emerges, generated by the pseudoscalar I.
It is a well-known feature of the Kerr solution that it is underpinned by a
complex analytic structure. The origin of this lies in the natural complex structure
of spacetime bivectors. Throughout this section we use complex to refer to a
combination of scalar and pseudoscalar quantities.
The bracket structure defined by our choice of the w function is 


[Le Lr] = -T Li- (K + K’)Lg, [Lr Lg] = —S'L, — G' Lg 
“a ôl = -SLi + (J — J')L3 [L,,L3] = —(K —K')L,—GL 3, (14.271) 


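The Riemann tensor generated by these fields is complicated and, rather than
giving its full algebraic expression, it is simpler to consider the general form.
This can be written as

\[ \begin{aligned}
\mathcal{R}(\sigma_r) &= \alpha_1\sigma_r + \beta_1\sigma_\theta, & \mathcal{R}(I\sigma_r) &= \alpha_4 I\sigma_r + \beta_4 I\sigma_\theta, \\
\mathcal{R}(\sigma_\theta) &= \alpha_2\sigma_\theta + \beta_2\sigma_r, & \mathcal{R}(I\sigma_\theta) &= \alpha_5 I\sigma_\theta + \beta_5 I\sigma_r, \\
\mathcal{R}(\sigma_\phi) &= \alpha_3\sigma_\phi, & \mathcal{R}(I\sigma_\phi) &= \alpha_6 I\sigma_\phi,
\end{aligned} \tag{14.272} \]

where each of the $\alpha_i$ and $\beta_i$ is a complex combination. If we now specialise to
the case of vacuum solutions, so that the Riemann tensor is determined solely
by the Weyl tensor, the duality relation $\mathcal{W}(IB) = I\mathcal{W}(B)$ immediately sets

\[ \alpha_1 = \alpha_4, \quad \alpha_2 = \alpha_5, \quad \alpha_3 = \alpha_6, \quad \beta_1 = \beta_4, \quad \beta_2 = \beta_5. \tag{14.273} \]

In addition, for a vacuum solution $\mathcal{R}(B)$ must be symmetric and traceless. The
most general form of tensor consistent with this requirement is

\[ \begin{aligned}
\mathcal{R}(\sigma_r) &= \alpha_1\sigma_r + \beta\sigma_\theta, \\
\mathcal{R}(\sigma_\theta) &= \alpha_2\sigma_\theta + \beta\sigma_r, \\
\mathcal{R}(\sigma_\phi) &= -(\alpha_1 + \alpha_2)\,\sigma_\phi,
\end{aligned} \tag{14.274} \]

with $\alpha_i$ and β complex combinations.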

Next we consider the rotational gauge freedom in our choice of axisymmetric
fields. We are free to perform a rotation in the $I\sigma_\phi$ plane, and a boost in the $\sigma_\phi$
direction. These can be summarised in the single rotor R:

\[ R = \exp(w I\sigma_\phi/2), \tag{14.275} \]

where the scalar + pseudoscalar quantity w is an arbitrary function of (r, θ). This
gauge freedom can be employed to diagonalise the Riemann tensor by setting
β = 0. This removes all of the gauge freedom present, and enables us to write

\[ \mathcal{R}(\sigma_r) = \alpha_1\sigma_r, \qquad \mathcal{R}(\sigma_\theta) = \alpha_2\sigma_\theta, \qquad \mathcal{R}(\sigma_\phi) = -(\alpha_1 + \alpha_2)\,\sigma_\phi. \tag{14.276} \]

The form of the Riemann tensor for the Schwarzschild solution is algebraically
special, in that two of its eigenvalues are degenerate. This is referred to as having
Petrov type D. There is no reason to expect the same to be true for axisymmetric
fields, and the field outside a general rotating star is almost certainly not of
type D. But it turns out that, if a horizon is present, the solution must be of
type D. As we are interested here in deriving the Kerr solution, we therefore
impose the additional condition that the Riemann tensor is degenerate, with the
general algebraic form

\[ \mathcal{R}(B) = -\frac{\alpha}{2}\,(B + 3\sigma_r B \sigma_r), \tag{14.277} \]

with α a scalar + pseudoscalar quantity. This final restriction on the form of
$\mathcal{R}(B)$ is not a gauge choice; it is a restriction on the form of solution we can
construct.

Comparing the general form of equation (14.277) with the explicit Riemann
tensor constructed from the ω field, we establish that


a=(G+IJ\(T+IJ)+(S+1K)(H+I4K). (14.278) 
The remaining identities reduce to a series of equations, an example of which is 


L,(G+ IJ) =(S'+IK'’-S—IK)(H+1K)-I(K — K')(S+ IK) 
— (G+T+IJ\(G+IJ). (14.279) 
In all there are ten equations of this type. They all relate intrinsic derivatives of 


the variables in the w field to quadratic combinations of the same variables. By 
forming suitable combinations of these equations we find that 


L,a = —3a(G+ IJ), Lga = —30(S + IK), (14.280) 


so the intrinsic derivatives of a are quite simple. 
Next we must consider the Bianchi identities. These contain higher order
consistency relations between the h and ω fields. For the Schwarzschild and
cylindrical cases these contained no new information, but this is not the case for
the axisymmetric setup. If we consider the equation


DR(o;) — OaR(a-Doy) = 0 (14.281) 


we obtain the pair of equations 


3 
L,a = -32G +IJ4+G'+IJ"), 
i (14.282) 
Lja = =32(S + IK+ 9 +IK'), 
Comparing these with equation (14.280), we see that

\[ G' + IJ' = G + IJ, \qquad S' + IK' = S + IK. \tag{14.283} \]

This simplification for type D fields explains our choice of notation of primed
and unprimed variables.
With four of the variables now solved for, the remaining equations simplify to 


L,(G+IJ) =-(G4+ 1S)? -T(G+1J), 
L, (T +1IJ)= (S+ IK}? — (24+ IJ) -T)(T+1J) 
—2S(H +IK), (14.284) 
L,(S+IK) = —-IJ(S + IK) -21K(G+T1J), 
L,(H + IK) =-(G+IJ)\($+1K)-G(H+1I1K) 


and 
Lj(S+1K) =(S+IK)?+H(S+I1K), 
Lj(H+1K) =—-(G+IJ)? + (2($+ IK) — H)(H+ IK) 
+ 2G(T + TJ), (14.285) 
Lj(G+I1J) =1K(G+I1J)+21)(S +14), 
Lj(T+IJ) =(G+II\(S+1K)+S(T+1J). 


These equations are all consistent with the bracket structure, which now takes
the form

\[ [L_r, L_{\hat\theta}] = -S L_r - G L_{\hat\theta}. \tag{14.286} \]

Our set of equations is now complete. We have explicit forms for the intrinsic
derivatives of all of our variables; these are all consistent with the bracket
structure, and the full Bianchi identities are all satisfied. We have achieved the first
main goal of the intrinsic method.


14.7.2 The Kerr solution 


The vacuum equations summarised in equations (14.284) and (14.285) display a
number of remarkable features. They are naturally complex, with the spacetime
pseudoscalar as the unit imaginary, and there is a clear symmetry between the
r and θ equations. We now demonstrate that, subject to certain boundary
conditions, these equations admit a unique, two-parameter family of solutions.
This is the Kerr solution. The proof is constructive, but it is slightly involved
and we will skip some of the details.

The first step in solving a set of intrinsic equations is the identification of
suitable integrating factors. To find the first of these consider the function

\[ Z = Z_0\,\alpha^{-1/3}, \tag{14.287} \]

where $Z_0$ is an arbitrary complex constant. The function Z satisfies

\[ L_r Z = (G + IJ)\, Z, \qquad L_{\hat\theta} Z = -(S + IK)\, Z. \tag{14.288} \]

On separating Z into modulus X and argument χ,

\[ Z = X e^{I\chi}, \tag{14.289} \]

we find that

\[ L_r X = GX, \qquad L_{\hat\theta} X = -SX. \tag{14.290} \]

It follows that X acts as an integrating factor for G and S. But if we recall the
bracket of equation (14.286), we see that

\[ [X L_r, X L_{\hat\theta}] = 0. \tag{14.291} \]

We have therefore constructed a pair of commuting derivations. This is sufficient
to ensure that we can fix our displacement gauge freedom by setting $g_3 = i_3 = 0$.
With this done, we can then write

\[ X L_r = g(r)\,\partial_r, \qquad X L_{\hat\theta} = i(\theta)\,\partial_\theta, \tag{14.292} \]

where g(r) and i(θ) are arbitrary functions that we can choose with further gauge
fixing.
More generally, if a pair of variables A and B satisfy the equation 


L;A—L,B=GB+SA (14.293) 
then an integrating factor C exists defined (up to an arbitrary magnitude) by 
L,C=AC, LC = BC. (14.294) 


One such pair is T and —H. For these we define the integrating factor F, 
satisfying 


L,F=TF, Lyk = —-HF. (14.295) 


With the integrating factors X, Z and F at our disposal, we can considerably 
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simplify our equations for G+ IJ and S + IK to obtain 
L,(FZ(G+IJ))=0,  L(XZ(G+I1J)) =-2XZ(SG + JK), 


(14.296) 
Lj(FZ(S+IK))=0,  L,(XZ(S+IK)) =2XZ(SG+ JK). 


These equations focus attention on the quantity SG + JK. On forming the 
derivatives of this quantity we see that 


L,(XF(SG+ JK)) = Lj(XF(SG+ JK)) =0, (14.297) 


and it follows that XF(SG + JK) is a constant. For the Schwarzschild solution 
this constant is zero. We therefore expect that this term should also vanish 
for a rotating source since, at large distances, the fields should tend to the 
Schwarzschild case. It turns out that one can construct solutions with $XF(SG+JK) \neq 0$,
but these are appropriate for an infinite disc of matter and not a
localised source. As we are looking for the fields outside a localised rotating 
source, we can set 


SG+JK =0. (14.298) 


It follows that 
\[ XFZ^2(G+IJ)(S+IK) = C_1, \tag{14.299} \]
where $C_1$ is an arbitrary complex constant.
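
The step to (14.299) can be made explicit. The following is a sketch under stated assumptions: the fields depend only on $r$ and $\theta$, the operators $L_r$ and $L_{\hat\theta}$ act as derivations, and, once $SG + JK = 0$ is imposed, the four relations of (14.296) are read as $L_r(FZ(G+IJ)) = L_{\hat\theta}(XZ(G+IJ)) = 0$ and $L_{\hat\theta}(FZ(S+IK)) = L_r(XZ(S+IK)) = 0$. All factors are scalar + pseudoscalar combinations, so they commute and the product rule applies directly:

```latex
\begin{align*}
L_r\bigl(XFZ^2(G+IJ)(S+IK)\bigr)
  &= L_r\bigl(FZ(G+IJ)\bigr)\,XZ(S+IK) + FZ(G+IJ)\,L_r\bigl(XZ(S+IK)\bigr) = 0, \\
L_{\hat\theta}\bigl(XFZ^2(G+IJ)(S+IK)\bigr)
  &= L_{\hat\theta}\bigl(XZ(G+IJ)\bigr)\,FZ(S+IK) + XZ(G+IJ)\,L_{\hat\theta}\bigl(FZ(S+IK)\bigr) = 0.
\end{align*}
```

Both derivatives vanish, so the product is constant, which is the statement of (14.299).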


Remarkably, we are now close to a complete solution to the problem. Equa- 
tion (14.296) tells us that we can set 


\[ FZ(G+IJ) = W(\theta), \qquad FZ(S+IK) = U(r), \tag{14.300} \]


where $U$ and $W$ are complex functions of $r$ and $\theta$ respectively. If we now form


\[ \frac{W(\theta)}{U(r)} = \frac{G+IJ}{S+IK} = I\,\frac{SJ-GK}{S^2+K^2}, \tag{14.301} \]


we see that the result is a pure imaginary quantity. It follows that $W$ and $U$ are
$\pi/2$ out of phase and, since $U$ and $W$ are separately functions of $r$ and $\theta$, their
phases must be constant. Next we construct the derivatives of $Z$ to obtain


\[ X L_r Z = g(r)\partial_r Z = XZ(G+IJ) = C_1/U(r) \tag{14.302} \]
and
\[ X L_{\hat\theta} Z = i(\theta)\partial_\theta Z = -XZ(S+IK) = -C_1/W(\theta). \tag{14.303} \]


It follows that $Z$ must be the sum of a function of $r$ and a function of $\theta$.
Furthermore, these functions must also have constant phases, $\pi/2$ apart. Since the
overall phase of $Z$ is arbitrary ($Z$ was defined up to an arbitrary complex scale
factor), we can write


\[ Z = R(r) + I\Psi(\theta), \tag{14.304} \]
where $R(r)$ and $\Psi(\theta)$ are real functions. These satisfy the equation


Z 


XL, (XL, ln(Z)) + XL;(XL;n(Z)) = a = E 


(14.305) 


which is to be solved for $R$ and $\Psi$.
There is considerable gauge freedom in equation (14.305), since we are free to
choose the functions $g(r)$ and $i(\theta)$. The most convenient choice of gauge is to set


\[ Z = r - Ia\cos(\theta). \tag{14.306} \]


The remaining functions are then found by integration. The end result, after a 
series of further gauge choices, is the Kerr solution in the form 


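\[
\begin{aligned}
\bar h(e^t) = g^t &= \frac{r^2+a^2}{\rho\Delta^{1/2}}\, e^t + \frac{a r \sin^2(\theta)}{\rho}\, e^\phi, \\
\bar h(e^r) = g^r &= \frac{\Delta^{1/2}}{\rho}\, e^r, \\
\bar h(e^\theta) = g^\theta &= \frac{r}{\rho}\, e^\theta, \\
\bar h(e^\phi) = g^\phi &= \frac{r}{\rho}\, e^\phi + \frac{a}{\rho\Delta^{1/2}}\, e^t,
\end{aligned} \tag{14.307}
\]
where
\[ \rho^2 = X^2 = r^2 + a^2\cos^2(\theta) \tag{14.308} \]
and
\[ \Delta = r^2 - 2Mr + a^2. \tag{14.309} \]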


The mass is given by M, and the angular momentum by aMc. The quantity a 
is the angular momentum per unit mass, and has dimensions of distance. The 
limit $a \to 0$ recovers the Schwarzschild solution in the form appropriate for the
exterior of a non-rotating star. The reciprocal vectors are 


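\[
\begin{aligned}
g_t &= \frac{\Delta^{1/2}}{\rho}\, e_t - \frac{a}{r\rho}\, e_\phi, \\
g_r &= \frac{\rho}{\Delta^{1/2}}\, e_r, \\
g_\theta &= \frac{\rho}{r}\, e_\theta, \\
g_\phi &= \frac{r^2+a^2}{r\rho}\, e_\phi - \frac{a\Delta^{1/2}\sin^2(\theta)}{\rho}\, e_t.
\end{aligned} \tag{14.310}
\]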
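
As a numerical cross-check, the inner products of the reciprocal vectors should reproduce the familiar Boyer-Lindquist metric components. The sketch below is illustrative only. It assumes the reading of (14.310) in which $g_t$ and $g_\phi$ each carry a minus sign on their cross term, uses $e_\theta = r\hat\theta$ and $e_\phi = r\sin(\theta)\hat\phi$, and compares against the standard textbook components with signature $(+,-,-,-)$. The function names are our own.

```python
import math

def dot(u, v):
    """Inner product, signature (+,-,-,-), in the frame (e_t, e_r, theta_hat, phi_hat)."""
    return u[0]*v[0] - u[1]*v[1] - u[2]*v[2] - u[3]*v[3]

def kerr_frame(r, theta, M, a):
    """Components of (g_t, g_r, g_theta, g_phi), assuming this reading of (14.310)."""
    s = math.sin(theta)
    rho = math.sqrt(r*r + (a*math.cos(theta))**2)
    root_delta = math.sqrt(r*r - 2*M*r + a*a)   # valid outside the horizon only
    g_t = (root_delta/rho, 0.0, 0.0, -a*s/rho)
    g_r = (0.0, rho/root_delta, 0.0, 0.0)
    g_theta = (0.0, 0.0, rho, 0.0)
    g_phi = (-a*root_delta*s*s/rho, 0.0, 0.0, (r*r + a*a)*s/rho)
    return g_t, g_r, g_theta, g_phi

def boyer_lindquist(r, theta, M, a):
    """Standard Boyer-Lindquist metric components, signature (+,-,-,-)."""
    s2 = math.sin(theta)**2
    rho2 = r*r + (a*math.cos(theta))**2
    delta = r*r - 2*M*r + a*a
    return {
        "tt": 1 - 2*M*r/rho2,
        "tphi": 2*M*a*r*s2/rho2,
        "rr": -rho2/delta,
        "thth": -rho2,
        "phph": -(r*r + a*a + 2*M*a*a*r*s2/rho2)*s2,
    }
```

Evaluating `dot(g_t, g_phi)` and the diagonal products at any point with $\Delta > 0$ and comparing with `boyer_lindquist` confirms that this frame generates the usual line element.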
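The variables controlling the $\omega$ field are given by


\[
\begin{aligned}
G + IJ &= \frac{\Delta^{1/2}}{\rho\,(r - Ia\cos(\theta))}, \\
S + IK &= \frac{-Ia\sin(\theta)}{\rho\,(r - Ia\cos(\theta))}, \\
T &= \frac{r - M}{\rho\Delta^{1/2}}, \\
H &= \frac{\cos(\theta)}{\rho\sin(\theta)}.
\end{aligned} \tag{14.311}
\]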
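
The first two entries of (14.311) can be checked against the earlier steps of the derivation. Because scalar + pseudoscalar combinations commute and $I^2 = -1$, they can be modelled by ordinary complex numbers. The sketch below makes that substitution (an assumption of this example, with `1j` standing in for $I$) and confirms that $SG + JK = 0$, that $XZ(G+IJ)$ depends on $r$ alone and that $XZ(S+IK)$ depends on $\theta$ alone. The function name is hypothetical.

```python
import math

def kerr_type_d_variables(r, theta, M, a):
    """Return (rho, delta, Z, G+IJ, S+IK) with the pseudoscalar I modelled by 1j."""
    I = 1j
    rho = math.sqrt(r*r + (a*math.cos(theta))**2)   # plays the role of X = |Z|
    delta = r*r - 2*M*r + a*a
    Z = r - I*a*math.cos(theta)
    G_IJ = math.sqrt(delta)/(rho*Z)
    S_IK = -I*a*math.sin(theta)/(rho*Z)
    return rho, delta, Z, G_IJ, S_IK

rho, delta, Z, G_IJ, S_IK = kerr_type_d_variables(3.0, 1.0, 1.0, 0.5)

# SG + JK is the real part of (G+IJ) times the conjugate of (S+IK).
sg_plus_jk = G_IJ.real*S_IK.real + G_IJ.imag*S_IK.imag

radial = rho*Z*G_IJ     # expected: sqrt(delta), a function of r only
angular = rho*Z*S_IK    # expected: -I a sin(theta), a function of theta only
```

The ratio `radial/angular` is pure imaginary, in line with the statement following (14.301).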


The equation for $T$ shows that a horizon exists where $\Delta = 0$. The fact that the
solution is singular there is a reflection of our choice of time coordinate. This 
measures the time for observers at a constant distance from the source. Such 
observers cannot exist inside the horizon, and the solution breaks down there. 
As with the Schwarzschild system, the resolution of this problem is to express 
the fields in terms of a different time coordinate. 

The Riemann tensor for the Kerr solution can now be written in the compact 
form 


\[ \mathcal{R}(B) = -\frac{M}{2(r - Ia\cos(\theta))^3}\,(B + 3\sigma_r B\sigma_r). \tag{14.312} \]


This is obtained from the Schwarzschild solution by simply replacing $r$ by the
scalar + pseudoscalar combination $r - Ia\cos(\theta)$. Precisely such a replacement
can be used to generate the Kerr solution using a ‘complex coordinate transformation’
in the Newman–Penrose formalism. This transformation does produce
the Kerr solution, but there is no a priori reason to expect that such a trans- 
formation applied to a vacuum solution will generate a new vacuum solution. 
Our extremely compact form of the Riemann tensor for the Kerr solution is a 
significant advantage of the gauge theory approach to gravitation advocated in 
this book. The comparison with the standard tensor formulation of general rel- 
ativity is dramatic — most textbooks devote nearly a page to listing all of the 
components of the Riemann tensor, if they list them at all. 


14.7.3 A Newtonian gauge for the Kerr solution


The form of the Kerr solution developed in the preceding section gives rise to 
a metric that expresses the geometry in terms of Boyer–Lindquist coordinates.
Such a form is only appropriate for the region outside an extended object. If a 
horizon has formed we must find an alternative gauge choice which covers the 
horizon smoothly. From our discussion of the Schwarzschild solution, we would 
like to find an analogue of the Newtonian gauge appropriate for rotating black 
holes. Such a gauge does exist, though it is not straightforwardly obtained from
the Boyer–Lindquist setup.

The first step in expressing the Kerr solution in a Newtonian gauge is the
introduction of spheroidal coordinates $(\bar r, \bar\theta, \phi)$, as described in section 6.2.2.
The spheroidal coordinates are related to their spherical counterparts $(r, \theta, \phi)$ as
follows:

\[ (\bar r^2 + a^2)^{1/2}\sin(\bar\theta) = r\sin(\theta), \qquad \bar r\cos(\bar\theta) = r\cos(\theta). \tag{14.313} \]

The scalar parameter $a$ is the same as that controlling the angular momentum.
In the limit $a \to 0$, the barred coordinates reduce to their unbarred spherical-polar
equivalents. Surfaces of constant $\bar r$ are ellipsoids in flat space, though a
statement such as this relates to the properties of the coordinate system, and
not necessarily to physically measurable features. It is convenient to introduce
the hyperbolic coordinate $u$, defined by


\[ a\sinh(u) = \bar r. \tag{14.314} \]
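The coordinate frame vectors are given by


\[
\begin{aligned}
e_r &= \tanh(u)\sin(\bar\theta)\,(\cos(\phi)\gamma_1 + \sin(\phi)\gamma_2) + \cos(\bar\theta)\gamma_3, \\
e_\theta &= a\cosh(u)\cos(\bar\theta)\,(\cos(\phi)\gamma_1 + \sin(\phi)\gamma_2) - a\sinh(u)\sin(\bar\theta)\gamma_3,
\end{aligned} \tag{14.315}
\]


with $e_\phi$ unchanged from its spherical definition. We also define the unit vectors
\[ \hat e_r = \frac{a\cosh(u)}{\rho}\, e_r, \qquad \hat e_\theta = \frac{1}{\rho}\, e_\theta, \tag{14.316} \]
where $\rho$ is defined by
\[ \rho^2 = a^2\sinh^2(u) + a^2\cos^2(\bar\theta) = \bar r^2 + a^2\cos^2(\bar\theta). \tag{14.317} \]
The unit frame vectors satisfy
\[ e_t \hat e_r \hat e_\theta \hat\phi = I. \tag{14.318} \]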
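
The coordinate relations (14.313), (14.314) and (14.317) are easy to exercise numerically. The following sketch is a minimal illustration, with function names of our own choosing. It maps spheroidal coordinates to Cartesian and spherical ones, assuming the usual placement of the symmetry axis along $\gamma_3$.

```python
import math

def spheroidal_to_cartesian(rbar, thetabar, phi, a):
    """Flat-space position for oblate spheroidal coordinates (rbar, thetabar, phi)."""
    radial = math.sqrt(rbar*rbar + a*a)*math.sin(thetabar)
    return (radial*math.cos(phi), radial*math.sin(phi), rbar*math.cos(thetabar))

def spheroidal_to_spherical(rbar, thetabar, a):
    """Spherical (r, theta) satisfying the two relations of (14.313)."""
    x, _, z = spheroidal_to_cartesian(rbar, thetabar, 0.0, a)
    return math.hypot(x, z), math.atan2(x, z)

def rho_squared(rbar, thetabar, a):
    """rho^2 as defined in (14.317)."""
    return rbar*rbar + (a*math.cos(thetabar))**2
```

Setting `a = 0` returns the input unchanged, which is the limit in which the barred coordinates reduce to spherical polars. Setting `rbar = 0` and `thetabar = pi/2` gives points at distance `a` from the axis in the equatorial plane.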


The Newtonian gauge form of the Schwarzschild solution, defined in equation (14.65),
contains the unit vectors $e_t$ and $e_r$. The generalisation of this
function to the Kerr solution is given by


\[ \bar h(n) = n - \left(\frac{2M\bar r}{\bar r^2 + a^2}\right)^{1/2} n\cdot\hat e_r\, v, \tag{14.319} \]


where the vector argument is denoted by $n$ to avoid confusion with the scalar
parameter $a$. The timelike velocity vector $v$ is defined by


\[ v = \cosh(\beta)\, e_t + \sinh(\beta)\, \hat\phi, \tag{14.320} \]
where
\[ \tanh(\beta) = \frac{\sin(\bar\theta)}{\cosh(u)} = \frac{a\sin(\bar\theta)}{(\bar r^2 + a^2)^{1/2}}. \tag{14.321} \]
It follows that
\[ \cosh(\beta) = \frac{a\cosh(u)}{\rho}, \qquad \sinh(\beta) = \frac{a\sin(\bar\theta)}{\rho}. \tag{14.322} \]


Comparison with equation (14.65) shows how the various terms are generalised 
in moving from the Schwarzschild to the Kerr solution. 
The $\omega(a)$ function generated by equation (14.319) has


w(er) = 0, 

a —_ M ê ^ 
w(ér) = a(F — Iacos(8))2" 

- (14.323) 
w(êg) = Palaro] eêg^v, 

k a 
RS cosh(3)(7 — Ta cos(0)) k 

where
\[ \alpha = \frac{(2M\bar r)^{1/2}}{\rho}. \tag{14.324} \]


The terms in the $\omega$ function also neatly generalise their counterparts in the
Schwarzschild solution. In particular, the fact that $\omega(e_t)$ vanishes implies that
$e_t$ satisfies the geodesic equation. The trajectories defined by this velocity define
a family of observers whose proper time is given by $t$.

The remaining covariant object to construct is the Riemann tensor. If we 
define the unit bivector 


\[ N = \hat e_r \wedge v, \tag{14.325} \]
then the Riemann tensor takes on the simple form
\[ \mathcal{R}(B) = -\frac{M}{2(\bar r - Ia\cos(\bar\theta))^3}\,(B + 3NBN). \tag{14.326} \]


This is obtained from the form of equation (14.312) by a displacement (taking
the unbarred to the barred coordinates) and a boost from $e_t$ to $v$. Both are gauge
transformations, so the intrinsic information in equations (14.312) and (14.326)
is precisely the same. The same transformations are involved in taking the h(a)
function from the form of equation (14.307) to that of equation (14.319). In
addition, further (singular) transformations are also required to convert $t$ to the
time measured by a set of infalling observers with covariant velocity $e_t$.


14.7.4 Geodesics and the horizon


The $h$ and $\omega$ fields for the Newtonian form of the Kerr solution are well defined
over all spacetime, down to the ring $\bar r = \cos(\bar\theta) = 0$. There are no problems with
motion through the horizon, and infalling observers reach the central singularity
in a finite coordinate time. This is because the coordinate $t$ now measures the
proper time for a family of free-falling observers with covariant velocity $e_t$. The


trajectories defined by this velocity have 
x’ = h(e,) = e; — aêr = e, — Ter. (14.327) 


This defines a family of observers all infalling along directions with constant $\bar\theta$
and $\phi$, and with infall velocity


2 472 
P= (25) (14.328) 


This family neatly generalises the observers in radial free fall from rest at infinity 
employed in the Schwarzschild solution. As in the spherical case, many physical 
phenomena are simplest to interpret when expressed in terms of observers with 
covariant velocity $e_t$. A curious feature of these observers is that they appear to
‘slow down’ as the singularity is approached, though they do reach $\bar r = 0$ in a
finite proper time.

The next task is to locate the horizon in our new form of the Kerr solution. 
A horizon marks the boundary between regions where one cannot signal to the 
other. This occurs where it is no longer possible to send null photons outwards. 
If $k$ denotes the covariant photon velocity, with $k^2 = 0$, a horizon will occur
when it is no longer possible to satisfy


\[ \hat e_r \cdot h(k) < 0. \tag{14.329} \]


The left-hand side of this inequality can also be written as


\[ \bar h(\hat e_r)\cdot k = \left(\hat e_r + \left(\frac{2M\bar r}{\bar r^2 + a^2}\right)^{1/2} v\right)\cdot k. \tag{14.330} \]


It is not possible for two future-pointing null vectors to have an inner product
less than 0, so the horizon occurs at


\[ \frac{2M\bar r}{\bar r^2 + a^2} = 1. \tag{14.331} \]
This defines a quadratic equation, with two solutions when a < M, one when 
a= M and no solutions for a > M. In the case where a < M, the outer horizon 
defines an event horizon. Photons can cross this on an inward trajectory, but no 
photons can escape. The inner horizon is slightly different. On the inside of the 
inner horizon it is possible for photons to travel outwards, but they cannot cross 
the horizon. Instead, they pile up just inside the boundary, forming an unstable
Cauchy horizon. 
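
The horizon condition (14.331) is equivalent to the quadratic $\bar r^2 - 2M\bar r + a^2 = 0$, so the three cases described above follow from the sign of its discriminant. A minimal sketch (the function name is our own, and units with $G = c = 1$ are assumed):

```python
import math

def kerr_horizons(M, a):
    """Roots of rbar^2 - 2*M*rbar + a^2 = 0, returned in increasing order.

    Two roots for |a| < M (inner and outer horizon), one for |a| = M,
    and none for |a| > M.
    """
    disc = M*M - a*a
    if disc < 0:
        return []
    if disc == 0:
        return [M]
    root = math.sqrt(disc)
    return [M - root, M + root]
```

For example, `kerr_horizons(1.0, 0.6)` returns two radii close to 0.2 and 1.8, and each satisfies $2M\bar r/(\bar r^2 + a^2) = 1$.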

Rather than considering observers attempting to exit to infinity, suppose
that we look for observers at rest with respect to the background $(\bar r, \bar\theta, \phi)$
coordinates. Such observers can be constructed from observations of distant stars,
for example. These observers have covariant velocity


\[ h^{-1}(\dot x) = \dot t\, h^{-1}(e_t) = \dot t\left(e_t + \left(\frac{2M\bar r}{\bar r^2 + a^2}\right)^{1/2}\cosh(\beta)\,\hat e_r\right), \tag{14.332} \]


and the condition that this is a unit timelike vector forces


\[ \dot t^2\left(1 - \frac{2M\bar r}{\bar r^2 + a^2\cos^2(\bar\theta)}\right) = 1. \tag{14.333} \]


The surface within which it is not possible to remain at rest is called the er- 
gosphere. For non-rotating black holes the horizon and ergosphere coincide. But 
for rotating black holes the ergosphere is defined by 


\[ \bar r^2 + a^2\cos^2(\bar\theta) - 2M\bar r = 0. \tag{14.334} \]


This surface lies outside the horizon, and touches the horizon at the poles. In 
the intervening region it is impossible to remain at rest, but it is still possible 
to escape. One can think of this in terms of the angular momentum of the hole 
dragging observers around with it. 
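
The relation between the ergosphere and the outer horizon can be checked directly from (14.334) and the horizon quadratic. The sketch below takes the outer root of each equation. The function names are illustrative, and $|a| \le M$ with $G = c = 1$ is assumed.

```python
import math

def outer_horizon(M, a):
    """Outer root of rbar^2 - 2*M*rbar + a^2 = 0."""
    return M + math.sqrt(M*M - a*a)

def ergosphere_radius(M, a, thetabar):
    """Outer root of rbar^2 + a^2 cos^2(thetabar) - 2*M*rbar = 0, equation (14.334)."""
    return M + math.sqrt(M*M - (a*math.cos(thetabar))**2)
```

At the poles the two radii coincide, while in the equatorial plane the ergosphere sits at $\bar r = 2M$ for any allowed $a$, outside the horizon whenever $a \neq 0$.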

To gain some further insight into the properties of the Kerr solution, consider 
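circular orbits in the equatorial plane ($\bar\theta = \pi/2$). For these we have


\[ \bigl(h^{-1}(\dot x)\bigr)^2 = \dot t^2 - (\bar r^2 + a^2)\dot\phi^2 - \frac{2M}{\bar r}(\dot t - a\dot\phi)^2 = 1. \tag{14.335} \]


The $\bar r$ derivative of this expression must vanish for a circular orbit, which tells
us that


\[ \bar r^3 = M\left(\frac{\dot t}{\dot\phi} - a\right)^2. \tag{14.336} \]


If we let $\Omega$ denote the angular velocity measured by our set of preferred
infalling observers (which are at rest at infinity), we have


\[ \Omega = \frac{\dot\phi}{\dot t}. \tag{14.337} \]
It follows that, for circular orbits,
\[ \Omega = \frac{M^{1/2}}{aM^{1/2} \pm \bar r^{3/2}}. \tag{14.338} \]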


For a given distance, there are two possible values of the angular velocity for
circular orbits. The larger value of $\Omega$ is for a particle corotating with the black
hole, and the smaller for a counterrotating orbit. Again, this effect can be
understood in terms of the black hole dragging matter around with it. The
larger angular velocity for corotating orbits means it is possible to form stable 
orbits much closer to the event horizon than for the Schwarzschild case. 
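
The two branches of the circular-orbit angular velocity can be illustrated numerically. The sketch below assumes the orbit condition in the form $\bar r^3 = M(1/\Omega - a)^2$ with $\Omega = \dot\phi/\dot t$, which solves to $\Omega = M^{1/2}/(aM^{1/2} \pm \bar r^{3/2})$. It labels the positive root as corotating for $a > 0$. The function names are our own.

```python
import math

def circular_orbit_omegas(M, a, rbar):
    """Return (corotating, counterrotating) angular velocities for an equatorial orbit."""
    root_m = math.sqrt(M)
    corotating = root_m/(a*root_m + rbar**1.5)
    counterrotating = root_m/(a*root_m - rbar**1.5)
    return corotating, counterrotating

def orbit_condition_residual(M, a, rbar, omega):
    """Residual of rbar^3 = M*(1/omega - a)^2; zero for a circular orbit."""
    return rbar**3 - M*(1.0/omega - a)**2
```

With `a = 0` both branches reduce to the Kepler value $\pm(M/\bar r^3)^{1/2}$, as expected for the Schwarzschild case.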


14.7.5 The Dirac equation in a Kerr background 


As a final illustration of the utility of the Newtonian gauge form of the Kerr 
solution, we return to the Dirac equation. We first form 


M 2Mr \\/? 1 
Opw(b) = =. 14.339 
pah) ap (a) t7 Tacos(@) ( ) 
The Dirac equation in the Newtonian gauge can therefore be written 
o 1 
Vy — (2M7)? | 2 Sy ṣ +e 
PER por ies” 2 + a?)1/2(F — Ia cos(0 OM 
= —mwly3. (14.340) 


If we again multiply through by $e_t$, we arrive at an interaction Hamiltonian of
the form 


1/2 
Axy= cee (Pe nee +a 27) 1/4) 
ar'/? cos(ĝ) 


—a cos(0) a + a?)172 Fay 


o 
pi/4 oao (FY) i w), (14.341) 
where we continue to use i for the quantum imaginary. This Hamiltonian is 
(almost) Hermitian when integrated over flat three-dimensional space, because 


the measure in oblate spheroidal coordinates is
\[ d^3x = \rho^2\sin(\bar\theta)\, d\bar r\, d\bar\theta\, d\phi. \tag{14.342} \]


Our form of the Kerr solution therefore does generalise the many attractive
features of the Newtonian gauge for the Schwarzschild solution. As in the
Schwarzschild case, the Hamiltonian is not self-adjoint when acting on normalised
wavefunctions. For the Kerr case a boundary term arises at $\bar r = 0$, which now defines
a disc of radius $a$.
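
The flat-space measure (14.342) can be sanity-checked by integrating it over the region $\bar r \le R$, which is an oblate spheroid with semi-axes $(R^2 + a^2)^{1/2}$, $(R^2 + a^2)^{1/2}$ and $R$, and so has volume $\tfrac{4}{3}\pi R(R^2 + a^2)$. The sketch below uses a simple midpoint rule. The function name and the grid size are arbitrary choices for this example.

```python
import math

def spheroid_volume(R, a, n=400):
    """Integrate rho^2 sin(thetabar) over 0 <= rbar <= R, 0 <= thetabar <= pi.

    The phi integral contributes a factor of 2*pi. Midpoint rule on an n x n grid.
    """
    d_r = R/n
    d_theta = math.pi/n
    total = 0.0
    for i in range(n):
        rbar = (i + 0.5)*d_r
        for j in range(n):
            thetabar = (j + 0.5)*d_theta
            rho2 = rbar*rbar + (a*math.cos(thetabar))**2
            total += rho2*math.sin(thetabar)
    return 2*math.pi*total*d_r*d_theta
```

The numerical result agrees with the closed form to within the discretisation error, and reduces to the volume of a sphere when `a = 0`.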

The Dirac equation (14.340) is separable in spheroidal coordinates, though the 
details of this separation are quite complicated. One problem is that the angular 
separation constant depends on the energy. This makes scattering calculations 
far more difficult than in the spherical case, as the separation constant must be 
recalculated for each energy. A considerable amount of work remains to be done 
in extending the detailed understanding of quantum theory in a Schwarzschild 
background to the Kerr case. 




14.8 Notes 


Many of the applications discussed in this chapter are covered in greater detail 
in the papers by Doran, Lasenby, Gull and coworkers. The solution method 
described in this chapter was first proposed in the paper ‘Gravity, gauge theories 
and geometric algebra’ by Lasenby, Doran & Gull (1998). This method should 
be compared with the spin coefficient formalism of Newman & Penrose (1962). 
The advantages of the Newtonian gauge for spherically-symmetric systems have 
been promoted by a handful of authors, most notably in the papers by Gautreau 
(1984), Gautreau & Cohen (1995), and by Martel & Poisson (2001). 

The problem of the electromagnetic fields created by a point charge at rest 
outside a Schwarzschild black hole was first tackled by Copson (1928), who ob- 
tained a solution that was valid locally in the vicinity of the charge, but con- 
tained an additional pole at the origin. Linet (1976) modified Copson’s solution 
by removing the singularity at the origin to obtain the potential described in 
section 14.3.5. Similar plots to those presented in section 14.3.5 were first ob- 
tained by Hanni & Ruffini (1973), though these authors did not extend their 
plots through the horizon. A popular means of interpreting these plots in terms 
of effects entirely around the horizon is advanced in The Membrane Paradigm 
by Thorne, Price & Macdonald (1986). We believe that a better understanding 
is gained by considering the global properties of fields, both inside and outside 
the horizon. 

Scattering and absorption processes by black holes have been widely discussed 
by many authors. Summaries of this work can be found in the books by Fut- 
terman, Handler & Matzner (1988) and Chandrasekhar (1983), or the article by 
Andersson and Jensen (2000). The first attempt at a quantum calculation of the 
scattering cross section was by Collins, Delbourgo & Williams (1973), though 
their derivation did not employ a consistent perturbation scheme. The calcu- 
lation described in this chapter was first published in the paper ‘Perturbation 
theory calculation of the black hole elastic scattering cross section’ by Doran 
and Lasenby (2002). Classical and quantum absorption processes are discussed 
in detail by Sanchez (1977, 1978) and Unruh (1976). 

Cylindrical systems are discussed by Deser, Jackiw & ’t Hooft (1984) and 
Jensen & Soleng (1992). The properties of cosmic strings are described in Cos- 
mic Strings and Other Topological Defects by Vilenkin & Shellard (1994). The 
solutions described in this chapter were developed in the paper ‘Physics of rotat- 
ing cylindrical strings’ by Doran, Lasenby & Gull (1996). The form of the fields 
outside a rotating black hole was first discovered by Kerr (1963), and has been 
widely discussed since. A fairly complete summary of this work is contained in
Chandrasekhar’s The Mathematical Theory of Black Holes (1983). The complex 
coordinate transformation trick for deriving the Kerr solution was discovered 
by Newman & Janis (1965), and later explained by Schiffer et al. (1973). The
uniqueness theorem for black holes was developed by Carter (1971) and Robinson
(1975). The analogue of the Newtonian gauge for the Kerr solution was
discovered by Doran (2000).

The applications of the gauge theory approach to gravity discussed in this 
chapter have concentrated on the simplest Einstein—Cartan theory. Modern de- 
velopments in quantum gravity have suggested a number of modifications to 
this theory. Two of the most common ideas include the introduction of local 
scale invariance, and the inclusion of higher order terms in the Lagrangian. The 
geometric algebra gauge theory approach is equally applicable in these settings. 
Some preliminary work on this subject is described by Lewis, Doran & Lasenby 
(2000). This field is developing rapidly, driven in part by developments in in- 
flationary theory and observations of the cosmic microwave background. These 
observations could well revolutionise our understanding of gravitation in future 
years. 


14.9 Exercises 
14.1 Spherical symmetry of the $\bar h$ function can be imposed by demanding that
\[ R\,\bar h_{x'}(\tilde R a R)\,\tilde R = \bar h(a), \]
where $R$ is a constant spatial rotor ($R e_t \tilde R = e_t$), and $x' = \tilde R x R$. Prove
that this symmetry implies that the $\{e^t, e^r\}$ and $\{e^\theta, e^\phi\}$ pairs decouple
from each other. Show further that we must have
\[ \bar h(\hat\theta) = \alpha\hat\theta + \beta\hat\phi, \]
\[ \bar h(\hat\phi) = \alpha\hat\phi - \beta\hat\theta, \]
and explain why we can always set $\beta = 0$ with a suitable gauge choice.
14.2 The energy-momentum tensor for an ideal fluid is
\[ \mathcal{T}(a) = (\rho + p)\, a\cdot v\, v - p\, a. \]
Show that covariant conservation of the energy-momentum tensor results
in the pair of equations
\[ \mathcal{D}\cdot(\rho v) + p\,\mathcal{D}\cdot v = 0, \]
\[ (\rho + p)(v\cdot\mathcal{D}v)\wedge v - (\mathcal{D}p)\wedge v = 0. \]
Give a physical interpretation of these equations.
14.3 The Schwarzschild line element is defined by
\[ ds^2 = \left(1 - \frac{2M}{r}\right) dt^2 - \left(1 - \frac{2M}{r}\right)^{-1} dr^2 - r^2\, d\theta^2 - r^2\sin^2(\theta)\, d\phi^2. \]
Find the equation for the free-fall time as measured by radially-infalling
observers, starting from rest at infinity. Express the line element in terms
of this new time coordinate to obtain the Painlevé–Gullstrand form
\[ ds^2 = dt^2 - \left(dr + \left(\frac{2M}{r}\right)^{1/2} dt\right)^2 - r^2\bigl(d\theta^2 + \sin^2(\theta)\, d\phi^2\bigr). \]
14.4 Prove that the total absorption cross section for a spherically-symmetric
black hole of mass $M$ is given by
\[ \sigma_{\mathrm{abs}} = \frac{\pi M^2}{2u^4}\left(8u^4 + 20u^2 - 1 + (1 + 8u^2)^{3/2}\right), \]
where $u$ is the incident velocity.
14.5 The covariant electromagnetic field generated by a charge at rest on the
$z$ axis outside a Schwarzschild black hole is defined by
\[ F = -\frac{\partial V}{\partial r}\, e_r e_t - \frac{1}{r - 2M}\frac{\partial V}{\partial\theta}\, \hat\theta\left(e_t + \sqrt{2M/r}\, e_r\right), \]
where
\[ V(r,\theta) = \frac{q}{ar}\,\frac{(r - M)(a - M) - M^2\cos^2(\theta)}{D} + \frac{qM}{ar} \]
and
\[ D = \left(r(r - 2M) + (a - M)^2 - 2(r - M)(a - M)\cos(\theta) + M^2\cos^2(\theta)\right)^{1/2}. \]
Prove that $F$ is finite and continuous at the horizon.
14.6 In calculating the scattering cross section from a black hole we need to
compute the integral


dk k? — p? ,
i= k .
=j On pr- ik- pe


Evaluate this integral by first displacing the origin in $k$-space by the
amount $(p_f + p_i)/2$, and then introducing spheroidal coordinates
\[ k_1 = a\sinh(u)\sin(v)\cos(\phi), \]
\[ k_2 = a\sinh(u)\sin(v)\sin(\phi), \]
\[ k_3 = a\cosh(u)\cos(v), \]
where $0 \le u < \infty$, $0 \le v \le \pi$, $0 \le \phi < 2\pi$ and $a = |q|/2$.
14.7 The Kerr–Schild form of the Schwarzschild solution is defined by
\[ \bar h(a) = a + \frac{M}{r}\, a\cdot e_-\, e_-, \qquad e_- = e_t - e_r. \]
Construct the Dirac equation in this gauge, and find the interaction
vertex factor in momentum space. Calculate the differential scattering
cross section for a fermion in this gauge, and verify that it is the same
as found in the Newtonian gauge.
14.8 Prove that $\det(h)$ is constant for spherically-symmetric vacuum
gravitational fields.
14.9 For a particle in a circular orbit around a Schwarzschild black hole,
prove that the non-relativistic binding energy (as defined by the effective
potential) is given by ($G = c = 1$)
\[ \frac{M}{2r}\,\frac{r - 4M}{r - 3M}. \]
14.10 Derive the full set of time-dependent radial equations with the
cosmological constant $\Lambda$ included.
14.11 A spherically-symmetric distribution of dust is released from rest, with
the initial density distribution chosen so that streamlines do not cross.
Prove that a singularity forms at the origin after a time


Ey =


where $\rho_0$ is the central density.
14.12 Solve the Dirac equation in a cosmological background with $k \neq 0$. Is the
Dirac field homogeneous? Can you construct self-consistent solutions to
this system of equations?
14.13 Construct a matched set of interior and exterior gravitational fields
around a rigidly-rotating cylindrical string. Do closed timelike curves
exist in this geometry?
14.14 Verify that the Kerr solution defined by equation (14.307) satisfies the
vacuum field equations.
14.15 The Riemann tensor for the Kerr solution can be written as
\[ \mathcal{R}(B) = -\frac{M}{2(r - Ia\cos(\theta))^3}\,(B + 3\sigma_r B\sigma_r). \]
Prove that this satisfies $\partial_b \mathcal{R}(b\wedge c) = 0$ and interpret both parts of this
result.
14.16 The Newtonian gauge form of the Kerr solution involves the spheroidal
coordinates $\bar r$ and $\bar\theta$. Prove that $\bar r = \cos(\bar\theta) = 0$ defines a ring.